Reference To A Related Application

The present application is a continuation-in-part of our copending U.S. Patent Application Serial No.
07/866,045, filed on April 9, 1992, which is incorporated by reference in its entirety.


Background of the Invention 

The present invention concerns non-A, non-B hepatitis (hereinafter called NANB hepatitis) virus
genome, polynucleotides, polypeptides, related antigen, antibody and detection systems for detecting
NANB antigens or antibodies.

Hepatitis A and hepatitis B are the forms of viral hepatitis for which the DNA or RNA of the causative
virus has been elucidated and for which diagnosis, and in some cases prevention, have been established.
The general name NANB hepatitis was given to the other forms of viral hepatitis.

Post-transfusion hepatitis was remarkably reduced after the introduction of diagnostic systems for
screening blood for transfusion for hepatitis B. However, there are still an estimated 280,000 annual
cases of post-transfusion hepatitis caused by NANB hepatitis in Japan.

NANB hepatitis viruses were recently named C, D and E according to their types, and scientists started
a worldwide research effort to identify the causative viruses and subsequently to eradicate them.

In 1988, Chiron Corp. claimed that they had succeeded in cloning the genome of an RNA virus, which they
termed hepatitis C virus (hereinafter called HCV), as the causative agent of NANB hepatitis and reported on
its nucleotide sequence (British Patent 2,212,511, which is the equivalent of European Patent Application
0,318,216). HCV (C100-3) antibody detection systems based on the sequence are now being introduced for
screening of blood for transfusion and for diagnosis of patients in Japan and in many other countries. The
detection systems for the C100-3 antibody have proven to be partially associated with NANB hepatitis;
however, they capture only about 70% of carriers and chronic hepatitis patients, and they fail to detect the
antibody in acute phase infection, thus leaving problems yet to be solved even after development of the
C100-3 antibody detection system by Chiron Corp.

The course of NANB hepatitis is troublesome: most patients are considered to become carriers and
then to develop chronic hepatitis. In addition, most patients with chronic hepatitis develop liver cirrhosis,
then hepatocellular carcinoma. It is therefore imperative to isolate the virus itself and to develop
effective diagnostic reagents enabling earlier diagnosis.

The existence of a number of NANB hepatitis cases which cannot be diagnosed by Chiron's C100-3 antibody
detection kits suggests a possible difference in subtype between Chiron's HCV and the Japanese NANB
hepatitis virus.

In order to develop NANB hepatitis diagnostic kits of greater specificity and to develop effective vaccines,
it is essential to analyze each subtype of the NANB hepatitis causative virus at the level of its
nucleotide sequence and the corresponding amino acid sequence.

Summary of the Invention



An object of the present invention is to provide the nucleotide sequence coding for the structural protein
of NANB hepatitis virus and, with such information, to analyze amino acids of the protein to locate and
provide polypeptides useful as antigen for establishment of detection systems for NANB virus, its related
antigens and antibodies.

A further object of the present invention is to locate polynucleotides essential to treatment, prevention
and diagnosis, and polypeptides effective as antigens, by isolating NANB hepatitis virus RNA from human
and chimpanzee virus carriers, cloning the cDNA covering the whole structural gene of the virus to
determine its nucleotide sequence, and studying the amino acid sequence coded for by the cDNA. As a result,
the inventors have determined the nucleotide sequence of the whole genome of a strain of NANB virus called
HC-J6 and a strain called HC-J8. The NANB hepatitis virus genomes of HC-J6 and HC-J8 differ from that of
Chiron's HCV.

Brief Description of the Drawings 

Figure 1 shows the restriction map and structure of the coding region of NANB hepatitis virus genome
(HC-J6) and positions of clones. C, E, NS-1, NS-2, NS-3, NS-4 and NS-5 are the abbreviations of core,
envelope, non-structure-1, -2, -3, -4 and -5.

Figures 2 to 4 show the method of determination of the nucleotide sequence of the 5' terminus of NANB
hepatitis virus genome of strains HC-J1, HC-J4 and HC-J6 respectively.

Figure 5 shows the method of determination of the nucleotide sequence of the 3' terminus of HC-J6
genome. Solid lines show nucleotide sequences determined by clones from libraries of bacteriophage
lambda gt10, and broken lines show nucleotide sequences determined by clones obtained by PCR.

Figure 6 shows the structure of the coding region of NANB hepatitis virus genome (HC-J8) and positions of
clones. Regions a to n indicate positions of amplification by PCR.

Detailed Description of the Invention 


The present invention provides NANB hepatitis virus genome RNA for strain HC-J6 (sequence list 1)
consisting of a noncoding region of 340 nucleotides on the 5' terminus, followed by an open reading frame
of 9099 nucleotides coding for the structural protein and non-structural protein, followed by a noncoding
region of 150 nucleotides containing a U-stretch of 108 uracils on the 3' terminus, and NANB
hepatitis virus genome having substantially the nucleotide sequence of sequence list 1.

The present invention provides polynucleotide N-9589 (strain HC-J6) comprising the DNA nucleotide
sequence of sequence list 2; cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3;
cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4; and NANB hepatitis virus
polynucleotides having substantially the sequence of nucleotides of NANB hepatitis virus nucleotides shown
in sequence lists 2 through 4.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J6 above, polypeptide
P-J6-3033, comprising the polypeptide sequence of sequence list 5, polypeptides produced by using
recombinant genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA
above, and polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention also provides NANB hepatitis virus genome for strain HC-J8 comprising
sequence list 6, NANB hepatitis virus RNA consisting of a noncoding region of 341 nucleotides on the
5' terminus, followed by an open reading frame of 9099 nucleotides coding for the structural
protein and non-structural protein, followed by a noncoding region of 71 nucleotides containing a
U-stretch of 30 uracils on the 3' terminus, comprising sequence list 6, and
NANB hepatitis virus genome having substantially the nucleotide sequence of sequence list 6.

The present invention provides polynucleotide N-9511 for strain HC-J8 comprising the DNA nucleotide
sequence of sequence list 7 and NANB hepatitis virus polynucleotide having substantially the sequence of
nucleotides of NANB hepatitis virus nucleotides comprising sequence list 7.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J8 above, polypeptide
P-J8-3033, comprising the polypeptide sequence of sequence list 8 and polypeptide P-J8-3033-2
comprising the polypeptide sequence of sequence list 9, polypeptides produced by using recombinant
genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA above, and
polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention, furthermore, provides a NANB hepatitis diagnostic system using the polypeptides or
antibodies described above.

In the method described below, NANB hepatitis virus RNA of the present invention was obtained and its 
nucleotide sequence was determined. 

Plasma samples (HC-J1, HC-J4, HC-J6 and HC-J8) were obtained from human and chimpanzee. HC-J1,
HC-J6 and HC-J8 were obtained from Japanese blood donors who had tested positive for HCV
antibody. HC-J4 was obtained from the chimpanzee subjected to the challenge test but was negative for
Chiron's C100-3 antibody previously mentioned.

RNA was isolated from each of the plasma samples. Following the study of approximately 2,500
nucleotides of the 5' terminus and approximately 1,100 nucleotides of the 3' terminus disclosed in Japanese
patent application No. 196175/91, the inventors have completed the study of the region coding for
non-structural protein of strain HC-J6 and the study of the full length sequence of 9,589 nucleotides of
HC-J6 genome RNA, and have completed the study of the region coding for non-structural protein of strain
HC-J8 and the study of the full length sequence of 9,511 nucleotides of HC-J8 genome RNA.

As described in the Example below, strain HC-J6 had a 5' noncoding region consisting of 340
nucleotides, and strain HC-J8 had a 5' noncoding region consisting of 341 nucleotides, followed by a region
coding for structural protein and a region coding for non-structural protein.

Concerning the 3' terminus, strain HC-J6 was found to have a region consisting of 150 nucleotides
containing a U-stretch consisting of 108 uracils following the region coding for non-structural protein,
and strain HC-J8 was found to have a region consisting of 71 nucleotides containing a U-stretch consisting
of 30 uracils following the region coding for non-structural protein.

The coding region starting with adenine (341st nucleotide from the 5' terminus for strain HC-J6 and
342nd nucleotide from the 5' terminus for strain HC-J8) was found to have a long open reading frame
consisting of 9099 nucleotides which codes for 3033 amino acids. HCV, or hepatitis C virus, is supposed to
be closely allied to flaviviruses in regard to its genetic structure. The coding region of the NANB hepatitis
virus genome of the present invention was considered to consist of regions named C (core), E (envelope),
NS-1 (non-structural-1), NS-2 (non-structural-2), NS-3 (non-structural-3), NS-4 (non-structural-4) and NS-5
(non-structural-5).
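
The genome layout stated above can be checked with simple arithmetic. The following is a minimal sketch only; the dictionary keys and the function name are illustrative assumptions, while the region lengths are those given in this description.

```python
# Region lengths in nucleotides, as stated in this description.
# The key names ("utr5", "orf", "utr3") are illustrative only.
GENOMES = {
    "HC-J6": {"utr5": 340, "orf": 9099, "utr3": 150},
    "HC-J8": {"utr5": 341, "orf": 9099, "utr3": 71},
}


def summarize(layout):
    """Return (total genome length, 1-based ORF start, number of codons)."""
    total = layout["utr5"] + layout["orf"] + layout["utr3"]
    orf_start = layout["utr5"] + 1
    codons, remainder = divmod(layout["orf"], 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    return total, orf_start, codons


for strain, layout in GENOMES.items():
    print(strain, summarize(layout))
```

Under these figures the totals agree with the polynucleotide names N-9589 (HC-J6) and N-9511 (HC-J8), and the ORF start positions agree with the 341st and 342nd nucleotides mentioned above.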

As compared with the sequence of HCV disclosed in the European Patent Application by Chiron Corp.
(Publication No. 388,232), homology of sequences of the strain HC-J6 was 67.9% for the full nucleotide
sequence and 72.3% for the full amino acid sequence, and homology of sequences of the strain HC-J8 was
66.4% for the full nucleotide sequence and 71.0% for the full amino acid sequence.
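
The description does not state how the homology percentages were computed. As one possible illustration, assuming the two sequences have already been aligned to equal length with "-" marking gaps, percent identity can be sketched as follows; the function name and the gap convention are assumptions, not the inventors' method.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions between two pre-aligned sequences.

    Positions where either sequence has a gap ("-") are skipped.
    This is an illustrative convention, not one taken from the description.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example: 7 of 8 positions are identical.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

The same function applies to amino acid sequences written in one-letter code.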

From an examination of homology by region for strain HC-J6, the homology of nucleotide sequences of
the 5' terminal noncoding region was 94.4% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, downstream from the envelope region,
homologies of amino acid sequence were found to be as low as 60.4% for E, 71.1% for NS-1, 57.8% for
NS-2, 81.1% for NS-3, 73.1% for NS-4, and 69.9% for NS-5. As a result, HC-J6 strain was found to be
significantly different from the HCV strain found by Chiron Corp.

From an examination of homology by region for strain HC-J8, the homology of nucleotide sequences of
the 5' terminal noncoding region was 93.8% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, downstream from the envelope region,
homologies of amino acid sequence were found to be as low as 54.7% for E, 73.1% for NS-1, 55.6% for
NS-2, 81.3% for NS-3, 72.1% for NS-4 and 67.3% for NS-5, and the homology of nucleotide sequences was
25.9% for the 3' terminal noncoding region. As a result, HC-J8 strain was found to be significantly
different from the HCV strain found by Chiron Corp.

From the comparison of the amino acid sequence of HC-J6 strain with strain HC-J1 (American type) and
strain HC-J4 (Japanese type) disclosed by the inventors (Japan. J. Exp. Med. (1990), 60: 167-177),
homology in the core region was more than 90% for each strain while that in the envelope region was
60.9% for HC-J1 and 53.1% for HC-J4. Thus, in the present invention, strain HC-J6 was found to be a
different type of virus from strains HC-J1 and HC-J4.

From the comparison of the nucleotide sequence of HC-J8 strain with strain HC-J1 (type I) and strain
HC-J4 (type II), homology of approximately 3,000 nucleotides of the 5' terminus was 70.1% for HC-J1 and
67.1% for HC-J4, and from the comparison of all nucleotides with HC-J6 (type III) genome, homology was as
low as 76.9%. On the other hand, HC-J8 showed high homology with strain HC-J7 (type IV) disclosed in
Japanese patent application 196175/91, namely 93.1% for approximately 3,000 nucleotides of the 5' terminus.

Nucleotide sequences of strains assumed to belong to the same type were supposed to show high homology.
For example, homology of 95.6% for approximately 3,000 nucleotides of the 5' terminus between HCV
disclosed by Chiron Corp. and HC-J1 appears to show that they should be classified into type I. On the
other hand, low homology of HC-J8 with HCV, HC-J1, HC-J4 and HC-J6 appeared to show that it was not to be
classified into type I, II or III, but into type IV (the same as HC-J7).

Strain HC-J8 has some mutations in the nucleotides as shown in sequence lists 6 and 7 by symbols M,
R, W, S, Y, K and B. It also can be easily understood that it has some mutations of amino acids from
comparison of the sequences in sequence lists 8 and 9. Mutation of nucleotides was observed up to
approximately 1.4% in the whole genome and that of amino acids was observed up to approximately 1.7%
in the whole ORF. Thus the present invention includes genomes, polynucleotides and polypeptides of strain
HC-J8 having some mutations.
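
The symbols M, R, W, S, Y, K and B are standard IUPAC ambiguity codes, each standing for a set of possible bases. The sketch below, given for illustration only, expands a short DNA sequence containing such symbols into every concrete sequence it represents; the function name is hypothetical, and for RNA (sequence list 6) T would be read as U.

```python
from itertools import product

# IUPAC nucleotide codes used in the sequence lists (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",   # A or C
    "R": "AG",   # A or G (purine)
    "W": "AT",   # A or T
    "S": "CG",   # C or G
    "Y": "CT",   # C or T (pyrimidine)
    "K": "GT",   # G or T
    "B": "CGT",  # C, G or T (not A)
}


def expand(sequence):
    """List every unambiguous sequence represented by an IUPAC-coded one."""
    choices = [IUPAC[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand("AYG"))
```

The number of expansions grows multiplicatively with the number of ambiguous positions, so this is practical only for short stretches such as a single codon.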

In addition, the envelope (E) region (576 nucleotides/192 amino acids of amino acids 192-383) and the NS-1
region (1050 nucleotides/350 amino acids of amino acids 384-733), which have many mutations in HC-J8, are
called hyper-variable regions, since mutations were observed as 20 nucleotides/7 amino acids (3.47%/3.64%)
in the E region and 37 nucleotides/19 amino acids (3.52%/5.42%) in the NS-1 region. According to these
findings, the present invention can be recognized to include genomes and polypeptides coded for by the
genomes of strain HC-J8 having mutations of 3.5% to 5.5% in those regions.
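
The percentages quoted for the E and NS-1 regions follow from the counts given above. The sketch below reproduces them; the two-decimal truncation (rather than rounding) is an assumption inferred from the quoted figures, and the function name is illustrative.

```python
def truncated_percent(changed, total):
    """Percentage of changed positions, truncated to two decimal places."""
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (changed * 10000 // total) / 100


REGIONS = {
    # region: (mutated nt, total nt, mutated amino acids, total amino acids)
    "E": (20, 576, 7, 192),
    "NS-1": (37, 1050, 19, 350),
}

for name, (nt_mut, nt_len, aa_mut, aa_len) in REGIONS.items():
    print(name, truncated_percent(nt_mut, nt_len), truncated_percent(aa_mut, aa_len))
```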

The genome, polynucleotide, and cDNA clones of the present invention can be used as material to
produce peptides of the invention by integration into a host genome, e.g. E. coli or Bacillus, by means of
known genetic engineering techniques.

Polypeptides of the invention are useful as material for diagnostic agents to detect NANB hepatitis
antibodies with high specificity and as material to produce polyclonal and monoclonal antibodies by known
techniques.

Polyclonal and monoclonal antibodies of the invention are useful as materials for diagnostic agents to
detect NANB hepatitis antigens with high specificity.

A detection system using each polypeptide of the present invention or polypeptide with partial
replacement of amino acids, and a detection system using monoclonal or polyclonal antibodies to such
polypeptides, are useful as diagnostic agents of NANB hepatitis with high specificity and are effective to
screen out NANB hepatitis virus from blood for transfusion or blood derivatives. The polypeptides, or
antibodies to such polypeptides, can be used as a material for a vaccine against NANB hepatitis virus.

It is well known in the art that one or more nucleotides in a DNA sequence can be replaced by other
nucleotides in order to produce the same protein. The present invention also concerns such nucleotide
substitutions which yield DNA sequences which code for polypeptides as described above. It is also well
known in the art that one or more amino acids in an amino acid sequence can be replaced by equivalent
other amino acids, as demonstrated by U.S. Patent No. 4,737,487 which is incorporated by reference, in
order to produce an analog of the amino acid sequence. Any analogs of the polypeptides of the present
invention involving amino acid deletions, amino acid replacements, such as replacements by other amino
acids, or by isosteres (modified amino acids that bear close structural and spatial similarity to protein amino
acids), amino acid additions, or isostere additions can be utilized, so long as the sequences elicit
antibodies recognizing NANB antigens.
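
The degeneracy of the genetic code referred to above can be illustrated with a short translation routine. This is a sketch using the standard genetic code; the two example sequences are invented for illustration and are not taken from the sequence lists.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical DNA sequences that differ at four positions
# yet code for the same tripeptide.
print(translate("ATGAGCACG"), translate("ATGTCTACA"))
```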

Examples of application of this invention are shown below; however, the invention shall in no way be
limited to those examples.


Examples 



The 5' terminal nucleotide sequence and amino acid sequence of NANB hepatitis virus genome were 
determined in the following way: 


(1) Isolation of RNA 

RNA of the samples (HC-J1, HC-J6, HC-J8) from plasma of Japanese blood donors testing positive for
HCV (C100-3) antibody (by Ortho HCV Ab ELISA, Ortho Diagnostic System, Tokyo), and that of the sample
(HC-J4) from the chimpanzee challenged with NANB hepatitis for infectivity and negative for HCV antibody,
were isolated by the following method:

Tris chloride buffer (10 mM, pH 8.0) was added to each plasma sample, which was then centrifuged at 68 x 10^
rpm for 1 hour. The precipitate was suspended in Tris chloride buffer (50 mM, pH 8.0) containing 200 mM
NaCl, 10 mM EDTA, 2% (w/v) sodium dodecyl sulfate (SDS), and proteinase K 1 mg/ml, and incubated at 60°C
for 1 hour; the nucleic acids were then extracted by phenol/chloroform and precipitated by ethanol to
obtain RNA.

(2) HC-J1 and HC-J8 cDNA Synthesis 

The RNA isolated from HC-J1 or HC-J8 plasma was heated at 70°C for 1 minute and used as a
template; 10 units of reverse transcriptase (cDNA Synthesis System Plus, Amersham Japan) and 20 pmol
of oligonucleotide primer (20-mer) were added and incubated at 42°C for 1.5 hours to obtain cDNA. Primer
#8 (5'-GATGCTTGCGGAAGCAATCA-3') was prepared by referring to the basic sequence shown in
European Patent Application No. 88310922.5, which is relied on and incorporated herein by reference.


(3) Amplification of cDNA by Polymerase Chain Reaction (PCR)

cDNA was amplified for 35 cycles according to Saiki's method (Science (1988) 239: 487-491) using
Gene Amp DNA Amplifier Reagent (Perkin-Elmer Cetus) on a DNA Thermal Cycler (Perkin-Elmer Cetus).

For cDNA synthesis and for PCR for HC-J8, synthesized primers disclosed in Japanese patent
application 153402/90 and those based on HC-J1, HC-J4 and HC-J6 genomes disclosed in Japanese patent
application 196175/91 and below were utilized.

(4) Determination of 5' Terminal Nucleotide Sequence of HC-J1 and HC-J4 by Assembling cDNA Clones

As shown in Figures 2 and 3, nucleotide sequences of 5' termini of the genomes of strains HC-J1 and
HC-J4 were determined by combined analysis of clones obtained from the cDNA library constructed in
bacteriophage λgt10 and clones obtained by amplification of HCV specific cDNA by PCR.

Figures 2 and 3 show 5' termini of NANB hepatitis virus genome together with cleavage sites of
restriction endonucleases and sequences of primers used. In the figures, solid lines are nucleotide sequences
determined by clones from the bacteriophage λgt10 library while dotted lines show sequences determined by
clones obtained by PCR.

A 1656 nucleotide sequence of HC-J1 spanning nt454-2109 was determined by clone 041 which was
obtained by inserting the cDNA synthesized with the primer #8 into λgt10 phage vector (Amersham).

Another primer #25 (5'-TCCCTGTTGCATAGTTCACG-3') corresponding to nt824-843 was synthesized
based on the 041 sequence, and four clones (060, 061, 066 and 075) were obtained to cover the upstream
sequence nt18-843.
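
Primers used for cDNA synthesis, such as primer #25, are complementary to the genome strand, so the stretch of genome they anneal to is their reverse complement. The following minimal sketch computes it; the function name is illustrative, and treating #25 as an antisense primer is an inference from its use for upstream extension.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


primer_25 = "TCCCTGTTGCATAGTTCACG"
# Sense-strand (cDNA-form) sequence that the primer would anneal to.
target = reverse_complement(primer_25)
print(target, len(target))
```

The 20-nucleotide length agrees with the span nt824-843 given for this primer.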


(5) Determination of 5' Terminal Nucleotide Sequence of HC-J6. 

The nucleotide sequence of the 5' terminus of strain HC-J6 was determined from analysis of clones
obtained by PCR amplification as shown in Figure 4.

Isolation of RNA from HC-J6 and determination of its sequence were made in the same manner as
described in (2) above. Sequences in the range of nt24-2551 of the RNA were determined from the consensus
sequence of respective clones obtained by amplification by PCR using each pair of primers based on the
nucleotide sequence of HC-J4.



nt24-826
#32 (5'-ACTCCACCATAGATCACTCC-3')
#122 (5'-AGGTTCCCTGTTGCATAATT-3')
Clones: C9397, C9388, C9764

nt732-1907
#50 (5'-GCCGACCTCATGGGGTACAT-3')
#128 (5'-TCGGTCGTGCCCACTACCAC-3')
Clones: C9316, C9752, C9753

nt1847-2571
#149 (5'-TCTGTGTGTGGCCCAGTGTA-3')
#146 (5'-AGTAGCATCATCCACAAGCA-3')
Clones: C11621, C11624, C11655



In order to determine the sequence further upstream at the 5' terminus, cDNA was synthesized using
antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3') corresponding to nt246-265, dAs were then added to the
3' terminus of the cDNA using terminal deoxynucleotidyl transferase, and one-sided PCR amplification was
made twice as described below.

cDNA was amplified for 35 cycles as the first stage of PCR using oligo dT primer (20-mer) and antisense
primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188-207, followed by the second stage of PCR by 30
cycle amplification using the first PCR product as a template, oligo dT primer (20-mer) and antisense
primer #109 (21-mer; 5'-ACCGGATCCGCAGACCACTAT-3') corresponding to nt140 to 160. The obtained
PCR product was subcloned into M13 phage vector.

The nucleotide sequence from nt1 to 23 was determined from the consensus sequence of 13 isolated clones,
C9577, C9579, C9581, C9587, C9590, C9591, C9595, C9606, C9609, C9615, C9616 and C9619, obtained
above, which were considered to have the complete 5' terminus.

(6) Determination of nucleotide sequence of HC-J6 middle region.

A cDNA library was constructed using λgt10 according to the method described in (2) above from
100 ml of HC-J6 plasma as the starting material. Primers #162 and #81 were prepared for synthesis by
referring to the basic sequence shown in the European Patent Application Publication No. 318,216. Clones
were selected by plaque hybridization.

Nucleotide sequence from nt2552 to 8700 was determined from the consensus sequence of four obtained
cDNA clones 02 (nt699B to 8700), 06 (nt6485 to 8700), 08 (nt6008 to 8700) and 081 (nt2199 to 6168) as
shown in Figure 1. Clones 081 and 08 were found to have the nucleotide sequences shown in sequence lists 3
and 4 respectively.

(7) Determination of 3' terminal nucleotide sequence of HC-J6 strain. 

As shown in Figure 5, the nucleotide sequence of the 3' terminus of HC-J6 genome was determined by
analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Nucleotide sequence of HC-J6 from nt8701 to 9241 was determined from the consensus sequence of three
clones consisting of 938 nucleotides, C9760, C9234 and C9761, obtained by amplification of the sample using
primers #80 (5'-GACACCCGCTGTTTTGACTC-3') and #60 (5'-GTTCTTACTGCCCAGTTGAA-3').

Nucleotide sequence of the 3' terminus downstream from nt9242 was determined by the method described
below.

Isolation of RNA from HC-J6 was made in the same manner as described in (1) above. Poly(A) was added
to the 3' terminus of the obtained RNA using poly(A) polymerase, cDNA was synthesized using
oligo(dT)20 as a primer, and the obtained cDNA was used as a template for PCR.

The first PCR product was made using #97 (5'-AGTCAGGGCGTCCCTCATCT-3') as a sense primer
and oligo(dT)20 as an antisense primer. The second PCR product was made using #90
(5'-GCCGTTTGCGGCCGATATCT-3'), corresponding to a sequence downstream of #97, as a sense primer and
oligo(dT)20 as an antisense primer, with the first PCR product as a template. The PCR product obtained by
the two-step amplification was blunt-ended on both ends by treatment with T4 DNA polymerase, followed by
phosphorylation of the 5' terminus by T4 polynucleotide kinase. The obtained product was subcloned into
the HincII site of M13mp19 phage vector.

Nucleotide sequence of the 3' terminus was determined from the consensus sequence of 19 obtained clones,
C10311, C10313, C10314, C10320, C10322, C10323, C10326, C10328, C10330, C10333, C10334, C10336,
C10337, C10345, C10346, C10347, C10349, C10350 and C10357.

As a result, the nucleotide sequence of cDNA to HC-J6 genome RNA was determined as shown in
sequence list 2, and the full sequence of genome RNA was determined as shown in sequence list 1.

(8) Determination of amino acid sequences. 

From the nucleotide sequence of the genome of strain HC-J6, the sequence of the coding region
starting with ATG was determined. As a result, HC-J6 genome was found to have a long open
reading frame coding for a polypeptide precursor consisting of 3033 amino acid residues.
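
Locating a long open reading frame that begins with ATG can be sketched as below. This is an illustrative search only, not the procedure used by the inventors; the function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sequence):
    """Find the longest ATG-initiated open reading frame.

    Returns (0-based start index, length in nucleotides excluding the stop).
    RNA input is accepted; U is read as T.
    """
    seq = sequence.upper().replace("U", "T")
    best_start, best_length = 0, 0
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        pos = start
        # Walk codon by codon until a stop codon or the end of the sequence.
        while pos + 3 <= len(seq) and seq[pos:pos + 3] not in STOP_CODONS:
            pos += 3
        if pos - start > best_length:
            best_start, best_length = start, pos - start
    return best_start, best_length


# Toy sequence: ATG AAA TTT followed by the stop codon TAG.
start, length = longest_orf("CCATGAAATTTTAGG")
print(start + 1, length, length // 3)
```

For a genome with the layout described for HC-J6, such a search would be expected to report a start at nucleotide 341 and a length of 9099 nucleotides, i.e. 3033 codons.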

(9) Determination of 5' terminal nucleotide sequence of HC-J8


As shown in Figure 6, the nucleotide sequence of the 5' terminus of HC-J8 genome (a region) was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNA was synthesized using antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3')
of nt246 to 265 in the same manner as (2) above; a dA tail was then added at its 3' terminus by
terminal deoxynucleotidyl transferase, and the product was amplified by one-sided PCR in two stages.

That is, in the first stage, antisense primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188 to 207
was used with a sense primer selected from non-specific primers #165
(5'-AAGGATCCGTCGACATCGATAATACG(A)17-3') and #171 (5'-AAGGATCCGTCGACATCGATAATACG(T)17-3') to amplify the
dA-tailed cDNA by PCR for 35 cycles; and in the second stage, using the product of the first-stage PCR as
a template, non-specific primer #166 (5'-AAGGATCCGTCGACATCGAT-3') and antisense primer #109 (21-mer,
5'-ACCGGATCCGCAGACCACTAT-3') were added to initiate PCR for 30 cycles. The product of PCR was subcloned
into M13 phage vector.

Thirteen independent clones (poly dT-tailed: C14951, C14952, C14953, C14958, C14960, C14968,
C14971, C14972 and C14974; poly dA-tailed: C14987, C14996, C14999 and C15000) were obtained (each
considered to have the complete length of the 5' terminus), and the consensus sequence of nt1-139 of the
respective clones was determined.

(10) cDNA amplification of ORF region and 3' terminus by PCR

As shown in Figure 6, the nucleotide sequence downstream from nt140 of HC-J8 genome was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNAs to HC-J8 RNA were synthesized in the same manner as (2) above using
antisense primers described below, then they were amplified by PCR using sense and antisense primers
described below. Each product of PCR was subcloned into M13 phage vector, then the consensus sequence of
the respective clones of each region was determined.
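
The description does not say how the consensus sequence of the clones was derived. One common approach, shown here purely as an assumption, is a position-by-position majority vote over clone sequences of equal length; the function name and the toy reads are hypothetical.

```python
from collections import Counter


def consensus(clone_sequences):
    """Majority-vote consensus of equal-length clone sequences.

    Ties are resolved in favour of the base seen first at that position,
    which is an arbitrary choice made for this sketch.
    """
    if not clone_sequences:
        raise ValueError("at least one clone sequence is required")
    length = len(clone_sequences[0])
    if any(len(s) != length for s in clone_sequences):
        raise ValueError("clone sequences must have equal length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*clone_sequences)
    )


# Three toy clone reads; the second carries a single differing base.
print(consensus(["ACGTAC", "ACGAAC", "ACGTAC"]))
```

With three clones per region, as listed for most regions here, a single PCR-induced substitution in one clone is outvoted by the other two.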

The primers for cDNA synthesis and PCR amplification, and the obtained clones are shown
below for each region. The letter of each amplified region corresponds to that in Figure 6.

b region
nt45-847
Primer for cDNA synthesis: #122 (5'-AGGTTCCCTGTTGCATAATT-3')
Primer for PCR: sense: #32A (5'-CTGTGAGGAACTACTGTCTT-3')
antisense: #122
Clones: C15221, C15222, C15223

c region
nt732-1354
Primer for cDNA synthesis: #54 (5'-ATCGCGTACGCCAGGATCAT-3')
Primer for PCR: sense: #50 (5'-GCCGATCTCATGGGGTACAT-3')
antisense: #54
Clones: C15256, C15257, C15258

d region
nt1300-1879
Primer for cDNA synthesis: #199 (5'-GGGGTGAAACAATACACCGG-3')
Primer for PCR: sense: #205 (5'-GGGACATGATGATGAACTGG-3')
antisense: #199
Clones: C14221, C14222, C14223

e region
nt1833-2518
Primer for cDNA synthesis: #146 (5'-AGTAGCATCATCCACAAGCA-3')
Primer for PCR: sense: #150 (5'-ATCGTCTCGGCTAAGACGGT-3')
antisense: #146
Clones: C11535, C11540, C11566

f region
nt2433-3451
Primer for cDNA synthesis: #170 (5'-GCATAAGCAGTGATGGGGGC-3')
Primer for PCR: sense: #160 (5'-CAGAACATCGTGGAGGTGCA-3')
antisense: #170
Clones: C15348, C15349, C15356

g region
nt3404-4300
Primer for cDNA synthesis: #225 (5'-TCGCATATGATGATGTCATA-3')
Primer for PCR: sense: #238 (5'-GTAGAGGTGCAAGGGGTGGA-3')
antisense: #225
Clones: C15701, C15702, C15703

h region
nt4221-5015
Primer for cDNA synthesis: #216 (5'-GTGGTGTAGAGATAGGGGGA-3')
Primer for PCR: sense: #230 (5'-GCCATGACGTAGTCGAGATA-3')
antisense: #216
Clones: C15391, C15392, C15393

i region
nt4695-5062
Primer for cDNA synthesis: #210 (5'-GCATCTATGTGTGTGAGGCC-3')
Primer for PCR: sense: #209 (5'-TTCGACTCCGTGATCGACTG-3')
antisense: #210
Clones: C14087, C14088, C14089

j region
nt5021-6169
Primer for cDNA synthesis: #162 (5'-TCCGACTCCGTCACGTAGTG-3')
Primer for PCR: sense: #227 (5'-GTTCTGGGAAGCGGTCTTTA-3')
antisense: #162
Clones: C15421, C15422, C15423

k region
nt6027-6889
Primer for cDNA synthesis: #232 (5'-GATGGGTCTGTTAGCATGGA-3')
Primer for PCR: sense: #242 (5'-TTGGTAGTGGGAGTCATCTG-3')
antisense: #232
Clones: C15733, C15734, C15735

l region
nt6834-7735
Primer for cDNA synthesis: #239 (5'-ATCGGTAACTTCTCCTCTTC-3')
Primer for PCR: sense: #241 (5'-CCTTGCGATCCTGAACCTGA-3')
antisense: #239
Clones: C15798, C15799, C15800



m region
nt7656-8630
Primer for cDNA synthesis: #222 (5'-GACCAGGTCGTCTCCACACA-3')
Primer for PCR: sense: #229 (5'-GTCGTGTGCTGCTGCATGTC-3')
antisense: #222
Clones: C15376, C15378, C15381

n region
nt8325-9511
Primer for cDNA synthesis: #165
Primer for PCR: sense: #80 (5'-GACACCCGCTGTTTTGACTC-3')
non-specific: #165
Clones: C15270, C15271, C15272


From the analysis described above, full nucleotide sequence of cDNA to HC-J8 was determined as 
shown in sequence list 7, then full nucleotide sequence of HC-J8 genome RNA as shown in sequence list 6. 
Two amino acid sequences shown in sequence lists 8 and 9 represent those coded for by HC-J8 genome. 
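
Whether the amplified regions a to n together cover the whole HC-J8 genome without gaps can be verified from the nucleotide ranges listed above. The sketch below is illustrative; the function name is hypothetical, and the letters j, l and m are assigned by position in the series a to n of Figure 6, which is an assumption where the printed labels are unclear.

```python
# 1-based inclusive nucleotide ranges of the amplified regions of HC-J8.
REGIONS = {
    "a": (1, 139), "b": (45, 847), "c": (732, 1354), "d": (1300, 1879),
    "e": (1833, 2518), "f": (2433, 3451), "g": (3404, 4300),
    "h": (4221, 5015), "i": (4695, 5062), "j": (5021, 6169),
    "k": (6027, 6889), "l": (6834, 7735), "m": (7656, 8630),
    "n": (8325, 9511),
}


def uncovered(regions, genome_length):
    """Return the list of (start, end) stretches not covered by any region."""
    gaps = []
    next_needed = 1
    for start, end in sorted(regions.values()):
        if start > next_needed:
            gaps.append((next_needed, start - 1))
        next_needed = max(next_needed, end + 1)
    if next_needed <= genome_length:
        gaps.append((next_needed, genome_length))
    return gaps


print(uncovered(REGIONS, 9511))
```

An empty list indicates that every position from nt1 to nt9511 lies in at least one region, with adjacent regions overlapping so that the clone sequences can be joined.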

Utilizing known immunological techniques, it is possible to determine epitopes (e.g., from the core
region, etc.) from the polypeptides of sequence lists 5, 8 and 9. Determination of such epitopes of the
NANB hepatitis virus opens access to chemical synthesis of the peptide, manufacturing of the peptide by
genetic engineering techniques, synthesis of the polynucleotides, manufacturing of the antibody,
manufacturing of NANB hepatitis diagnostic reagents, and development of products such as NANB hepatitis
vaccines.

According to the well-known method described by Merrifield, NANB peptides can be synthesized.
Furthermore, the polynucleotides in sequence lists 2-4 and 7 can be used to express polypeptides in host
cells such as Escherichia coli by means of genetic engineering techniques.

A detection system for antibody against NANB hepatitis virus can be developed using polyvinyl
microtiter plates and the sandwich method. For example, 50 µl of a NANB peptide at a concentration of
5 µg/ml can be dispensed in each well of the microtiter plates and incubated overnight at room
temperature for consolidation. The microplate wells can be washed five times with physiological saline
containing 0.05% Tween 20. For overcoating, 100 µl of NaCl buffer containing 30% (v/v) of calf serum and
0.05% Tween 20 (OS buffer) can be dispensed in each well and discarded after incubation for 30 minutes at
room temperature.
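
The coating step fixes the amount of peptide offered to each well. The sketch below works through that arithmetic, reading the coating concentration as 5 µg/ml; the 96-well plate size is an assumption for illustration, since the specification does not state a plate format.

```python
def peptide_per_well_ug(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Mass of peptide dispensed per well, in micrograms."""
    return volume_ul * conc_ug_per_ml / 1000.0  # 1 ml = 1000 ul


def peptide_per_plate_ug(volume_ul: float, conc_ug_per_ml: float,
                         wells: int = 96) -> float:
    """Mass of peptide needed for a whole plate (96 wells is an assumption)."""
    return wells * peptide_per_well_ug(volume_ul, conc_ug_per_ml)


if __name__ == "__main__":
    print(peptide_per_well_ug(50, 5))   # micrograms per well
    print(peptide_per_plate_ug(50, 5))  # micrograms per assumed 96-well plate
```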

For determination of NANB antibodies in samples, in the primary reaction, 50 µl of the OS buffer
containing 30% calf serum and 10 µl of a sample can be dispensed in each microplate well and incubated
on a microplate vibrator for one hour at room temperature. After completion of the reaction, microplate wells
can be washed five times in the same way as previously described.

In the secondary reaction, as labeled antibody, 1 ng of horseradish peroxidase labeled anti-human IgG
mouse monoclonal antibodies (Fab' fragment: 22G, Institute of Immunology Co., Ltd., Tokyo, Japan)
dissolved in 50 µl of calf serum can be dispensed in each microplate well, and incubated on a microplate
vibrator for one hour at room temperature. Wells can be washed five times in the same way. After addition
of hydrogen peroxide (as substrate) and 50 µl of o-phenylenediamine solution (as color developer) in each
well, and after incubation for 30 minutes at room temperature, 50 µl of 4M sulphuric acid can be dispensed
in each well to stop further color development, and absorbance can be read at 492 nm.
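
The timed steps and volumes of the assay described above can be summarized as data. The sketch below adds up the incubations with stated durations (the overnight coating has no stated length and is left out) and computes the sample dilution in the primary reaction; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    minutes: int


# Incubations with an explicit duration in the description above.
TIMED_STEPS = [
    Step("overcoating", 30),
    Step("primary reaction", 60),
    Step("secondary reaction", 60),
    Step("color development", 30),
]


def total_minutes(steps) -> int:
    """Sum of the timed incubations, excluding washes and overnight coating."""
    return sum(step.minutes for step in steps)


def dilution_factor(sample_ul: float, buffer_ul: float) -> float:
    """Fold dilution of the sample once mixed with buffer in the well."""
    return (sample_ul + buffer_ul) / sample_ul


if __name__ == "__main__":
    print(total_minutes(TIMED_STEPS))   # minutes of timed incubation
    print(dilution_factor(10, 50))      # 10 ul sample in 50 ul buffer
```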

The cut-off level of this assay system can be set by measuring a number of donor samples with normal
serum ALT (alanine aminotransferase) value of 34 Karmen units or below and which tested negative for
anti-HCV.
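
The specification says only that the cut-off is set from negative donor samples and gives no formula. One common convention, assumed here purely for illustration, is the mean absorbance of the negative panel plus a multiple k of its standard deviation; the function name, the value of k and the absorbance readings below are hypothetical.

```python
import statistics


def cutoff_from_negatives(absorbances, k: float = 3.0) -> float:
    """Cut-off = mean + k * sample standard deviation of negative samples.

    Both the formula and the default k are assumptions, not taken from
    the specification.
    """
    if len(absorbances) < 2:
        raise ValueError("need at least two negative samples")
    mean = statistics.mean(absorbances)
    spread = statistics.stdev(absorbances)
    return mean + k * spread


def is_reactive(absorbance: float, cutoff: float) -> bool:
    """A sample is called reactive when its absorbance reaches the cut-off."""
    return absorbance >= cutoff


if __name__ == "__main__":
    # Hypothetical A492 readings from ALT-normal, anti-HCV-negative donors.
    negatives = [0.05, 0.07, 0.06, 0.08, 0.04]
    cutoff = cutoff_from_negatives(negatives)
    print(round(cutoff, 3), is_reactive(0.9, cutoff))
```

A larger negative panel gives a more stable estimate, and the choice of k trades sensitivity against specificity.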

The present invention makes possible detection of NANB hepatitis virus infection which could not be
detected by conventional determination methods, and provides NANB hepatitis detection kits capable of
highly specific and sensitive detection at an early phase of infection.

These features allow accurate diagnosis of patients at an early stage of the disease and also help to
remove NANB hepatitis virus carrier bloods at a higher rate through screening tests of donor bloods.

Polypeptides and their antibodies under this invention can be utilized for manufacture of vaccines and
immunological pharmaceuticals, and the structural gene of NANB hepatitis virus provides indispensable tools
for detection of polypeptide antigens and antibodies.

Antigen-antibody complexes can be detected by methods known in this art. Specific monoclonal and
polyclonal antibodies can be obtained by immunizing such animals as mice, guinea pigs, rabbits, goats and
horses with NANB peptides (e.g., bearing NANB hepatitis antigenic epitope).

The present invention is based on studies on isolated virus genomes of NANB hepatitis virus named
HC-J6 and HC-J8, and is completed by clarification of the full sequence of the nucleotides. The invention
makes possible highly specific detection of NANB hepatitis virus and provision of polypeptide, polyclonal
antibody and monoclonal antibody to prepare the test system.

Further variations and modifications of the invention will become apparent to those skilled in the art 
from the foregoing and are intended to be encompassed by the claims appended hereto. 

Japanese Priority Applications 287402/91 filed August 9, 1991 and 360441/91 filed on December 5,
1991 are relied on and incorporated by reference. U.S. patent applications serial no. 07/540,604 (filed June
19, 1990), 07/653,090 (filed February 8, 1991), and 07/712,875 (filed June 11, 1991) are incorporated by
reference in their entirety.
Sequence list

Sequence list 1: whole nucleotides of HC-J6 genome RNA
Sequence list 2: N-9589, whole nucleotides of cDNA to HC-J6 genome RNA
Sequence list 3: J6-081, nucleotides of clone J6-081
Sequence list 4: J6-08, nucleotides of clone J6-08
Sequence list 5: P-J6-3033, whole amino acids of ORF of HC-J6 genome
Sequence list 6: whole nucleotides of HC-J8 genome RNA
Sequence list 7: whole nucleotides of cDNA to HC-J8 genome RNA
Sequence list 8: whole amino acids of a variation of ORF of HC-J8 genome
Sequence list 9: whole amino acids of a variation of ORF of HC-J8 genome

Claims


1. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence of
sequence list 1.

2. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence
of sequence list 2.

3. cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3.

4. cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4.

5. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain
HC-J6, comprising the amino acid sequence of sequence list 5.

6. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence of
sequence list 6.

7. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence
of sequence list 7.

8. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain
HC-J8, comprising the amino acid sequence of sequence list 8.

9. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain
HC-J8, comprising the amino acid sequence of sequence list 9.

10. A non-A, non-B hepatitis diagnostic test kit for analyzing samples for the presence of antibodies 
directed against a non-A, non-B hepatitis antigen, comprising an antigen attached to a solid substrate 
and labeled anti-human immunoglobulin; wherein said antigen is an antigen selected from the antigens 
contained in sequence lists 5, 8 or 9. 


11. A method of detecting antibodies directed against a non-A, non-B hepatitis antigen in a sample, said 
method comprising: 

(a) reacting said sample with an antigen selected from the antigens contained in sequence lists 5, 8 
or 9 to form antigen-antibody complexes; and 
(b) detecting said antigen-antibody complexes.

12. A non-A, non-B hepatitis specific monoclonal or polyclonal antibody reactive with an antigen, said 
antigen is an antigen selected from the antigens contained in sequence lists 5, 8 or 9. 

13. A method of detecting non-A, non-B hepatitis antigen in a sample, said method comprising:

(a) reacting said sample with the non-A, non-B hepatitis monoclonal or polyclonal antibody according 
to claim 12 to form antigen-antibody complexes; and 

(b) detecting said antigen-antibody complexes. 

Sequence ID No. 1
Sequence Length: 9,589
Sequence Type: nucleic acid
Strandedness: single
Topology: linear
Molecule Type: genomic RNA
Method for Determination of Feature: E



ACCCGCCCCU AAUAGGGGCG ACACUCCGCC AUGAACCACU CCCCUGUGAG GAACUACUGU 60
CUUCACGCAG AAAGCGUCUA GCCAUGGCGU UAGUAUGAGU GUCGUACAGC CUCCAGGCCC 120
CCCCCUCCCG GGAGAGCCAU AGUGGUCUGC GGAACCGGUG AGUACACCGG AAUUGCCGGG 180
AAGACUGGGU CCUUUCUUGG AUAAACCCAC UCUAUGCCCG GUCAUUUGGG CGUGCCCCCG 240
CAAGACUGCU AGCCGAGUAG CGUUGGGUUG CGAAAGGCCU UGUGGUACUG CCUGAUAGGG 300
UGCUUGCGAG UGCCCCGGGA GGUCUCGUAG ACCGUGCACC AUGAGCACAA AUCCUAAACC 360
UCAAAGAAAA ACCAAAAGAA ACACCAACCG UCGCCCACAA GACGUUAAGU UUCCGGGCGG 420
CGGCCAGAUC GUUGGCGGAG UAUACUUGUU GCCGCGCAGG GGCCCCAGGU UGGGUGUGCG 480
CGCGACAAGG AAGACUUCGG AGCGGUCCCA GCCACGUGGA AGGCGCCAGC CCAUCCCUAA 540
GGAUCGGCGC UCCACUGGCA AAUCCUGGGG AAAACCAGGA UACCCCUGGC CCCUAUACGG 600
GAAUGAGGGA CUCGGCUGGG CAGGAUGGCU CCUGUCCCCC CGAGGUUCCC GUCCCUCUUG 660
GGGCCCCAAU GACCCCCGGC AUAGGUCCCG CAACGUGGGU AAGGUCAUCG AUACCCUAAC 720
GUGCGGCUUU GCCGACCUCA UGGGGUACAU CCCUGUCGUA GGCGCCCCGC UCGGCGGCGU 780
CGCCAGAGCU CUCGCGCAUG GCGUGAGAGU CCUGGAGGAC GGGGUUAAUU UUGCAACAGG 840
GAACUUACCC GGUUGCUCCU UUUCUAUCUU CUUGCUGGCC CUGCUGUCCU GCAUCACCAC 900
CCCGGUCUCC GCUGCCGAAG UGAAGAACAU CAGUACCGGC UACAUGGUGA CCAACGACUG 960
CACCAAUGAU AGCAUUACCU GGCAACUCCA GGCUGCUGUC CUCCACGUCC CCGGGUGCGU 1020
CCCGUGCGAG AAAGUGGGGA AUACAUCUCG GUGCUGGAUA CCGGUCUCAC CGAAUGUGGC 1080
CGUGCAGCAG CCCGGCGCCC UCACGCAGGG CUUACGGACG CACAUUGACA UGGUUGUGAU 1140
GUCCGCCACG CUCUGCUCCG CUCUUUACGU GGGGGACCUC UGCGGUGGGG UGAUGCUUGC 1200
AGCCCAGAUG UUCAUUGUCU CGCCACAGCA CCACUGGUUU GUGCAAGACU GCAAUUGCUC 1260
CAUCUACCCU GGUACCAUCA CUGGACACCG CAUGGCGUGG GACAUGAUGA UGAACUGGUC 1320
GCCCACGGCU ACCAUGAUCC UGGCGUACGC GAUGCGCGUC CCCGAGGUCA UCAUAGACAU 1380
CAUUGGCGGG GCUCAUUGGG GCGUCAUGUU CGGCUUAGCC UACUUCUCUA UGCAGGGAGC 1440
GUGGGCAAAA GUCGUUGUCA UUCUUUUGCU GGCCGCCGGG GUGGACGCGC AAACCCAUAC 1500
CGUUGGGGGU UCUACCGCGC AUAACGCCAG GACCCUCACC GGCAUGUUCU CCCUUGGUGC 1560
CAGGCAGAAA AUCCAGCUCA UCAACACCAA UGGCAGUUGG CACAUCAACC GCACCGCCCU 1620
GAACUGCAAU GACUCUUUGC ACACCGGCUU CCUCGCGUCA CUGUUCUACA CCCACAGCUU 1680
CAACUCGUCA GGAUGUCCCG AACGCAUGUC CGCCUGCCGC AGUAUCGAGG CCUUUCGGGU 1740
GGGAUGGGGC GCCUUACAAU AUGAGGACAA UGUCACCAAU CCAGAGGAUA UGAGACCGUA 1800
UUGCUGGCAC UACCCACCAA GACAGUGUGG UGUAGUCUCC GCGAGCUCUG UGUGUGGCCC 1860
AGUGUACUGU UUCACCCCCA GCCCAGUAGU AGUGGGUACG ACCGAUAGAC UUGGAGCGCC 1920
CACUUACACG UGGGGGGAGA AUGAGACAGA UGUCUUCCUA UUGAACAGCA CUCGACCACC 1980
GCAGGGGUCA UGGUUCGGCU GCACGUGGAU GAACUCCACU GGCUACACCA AGACUUGCGG 2040
CGCACCACCC UGCCGCAUUA GAGCUGACUU CAAUGCCAGC AUGGACUUGU UGUGCCCCAC 2100
GGACUGUUUU AGGAAGCAUC CUGAUACCAC CUACAUCAAA UGUGGCUCUG GGCCCUGGCU 2160
CACGCCAAGG UGCCUGAUCG ACUACCCCUA CAGGCUCUGG CAUUACCCCU GCACAGUUAA 2220
CUAUACCAUC UUCAAAAUAA GGAUGUAUGU GGGGGGGGUC GAGCACAGGC UCACGGCUGC 2280
GUGCAAUUUC ACUCGUGGGG AUCGUUGCAA CUUGGAGGAC AGAGACAGAA GUCAACUGUC 2340
UCCUUUGCUG CACUCCACCA CGGAGUGGGC CAUUUUACCU UGCACUUACU CGGACCUGCC 2400
CGCCUUGUCG ACUGGUCUUC UCCACCUCCA CCAAAACAUC GUGGACGUGC AAUUCAUGUA 2460
UGGCCUAUCA CCUGCUCUCA CAAAAUACAU CGUCCGAUGG GAGUGGGUAG UACUCUUAUU 2520
CCUGCUCUUA GCGGACGCCA GGGUUUGCGC CUGCUUAUGG AUGCUCAUCU UGUUGGGCCA 2580
GGCCGAAGCA GCACUAGAGA AGUUGGUCGU CUUGCACGCU GCGAGCGCAG CUAGCUGCAA 2640
UGGCUUCCUA UACUUUGUCA UCUUUUUCGU GGCUGCUUGG UACAUCAAGG GUCGGGUAGU 2700
CCCCUUGGCU ACUUAUUCCC UCACUGGCCU AUGGUCCUUU GGCCUACUGC UCCUAGCAUU 2760
GCCCCAACAG GCUUAUGCUU AUGACGCAUC UGUACAUGGU CAGAUAGGAG CAGCUCUGUU 2820
GGUACUGAUC ACUCUCUUUA CACUCACCCC CGGGUAUAAG ACCCUUCUCA GCCGGUUUCU 2880
GUGGUGGUUG UGCUAUCUUC UGACCCUGGC GGAAGCUAUG GUCCAGGAGU GGGCACCACC 2940
UAUGCAGGUG CGCGGUGGCC GUGAUGGGAU CAUAUGGGCC GUCGCCAUAU UCUGCCCGGG 3000
UGUGGUGUUU GACAUAACCA AGUGGCUCUU GGCGGUGCUU GGGCCUGCUU AUCUCCUAAA 3060
AGGUGCUUUG ACGCGUGUGC CGUACUUCGU CAGGGCUCAC GCUCUACUAA GGAUGUGCAC 3120
CAUGGUAAGG CAUCUCGCGG GGGGUAGGUA CGUCCAGAUG GUGCUACUAG CCCUUGGCAG 3180
GUGGACUGGC ACUUACAUCU AUGACCACCU CACCCCUAUG UCGGAUUGGG CUGCUAAUGG 3240
CCUGCGGGAC UUGGCGGUCG CCGUGGAGCC UAUCAUCUUC AGUCCGAUGG AGAAAAAAGU 3300
CAUCGUCUGG GGAGCGGAGA CAGCUGCUUG CGGGGAUAUC UUACACGGAC UUCCCGUGUC 3360
CGCCCGACUU GGCCGGGAGG UCCUCCUUGG CCCAGCUGAU GGCUAUACCU CCAAGGGGUG 3420
GAGUCUUCUC GCCCCCAUCA CUGCUUAUGC CCAGCAGACA CGCGGCCUUU UGGGCACCAU 3480
AGUGGUGAGC AUGACGGGGC GCGACAAGAC AGAACAGGCC GGGGAGAUUC AGGUCCUGUC 3540
CACGGUCACU CAGUCCUUCC UCGGAACAAC CAUCUCGGGG GUCUUAUGGA CUGUCUACCA 3600
UGGAGCUGGC AACAAGACUC UAGCCGGCUC ACGGGGUCCG GUCACACAGA UGUACUCCAG 3660
UGCUGAGGGG GACUUAGUGG GGUGGCCCAG CCCCCCCGGG ACCAAAUCUU UGGAGCCGUG 3720
CACGUGUGGA GCGGUCGACC UAUACCUGGU CACGCGAAAC GCUGAUGUCA UCCCGGCUCG 3780
AAGACGCGGG GACAAGCGAG GAGCGCUACU CUCCCCGAGA CCUCUUUCCA CCUUGAAGGG 3840
GUCCUCGGGG GGCCCGGUGC UCUGCCCCAG AGGCCACGCU GUCGGGGUCU UCCGGGCAGC 3900
CGUGUGCUCC CGGGGCGUGG CCAAGUCCAU AGAUUUUAUC CCCGUUGAGA CACUUGACAU 3960
CGUCACUCGG UCCCCCACCU UUAGUGACAA CAGCACACCA CCUGCUGUGC CCCAAACUUA 4020
UCAGGUCGGG UACUUACAUG CCCCGACUGG UAGUGGAAAG AGCACCAAAG UCCCUGUCGC 4080
GUAUGCCGCU CAGGGGUACA AAGUGCUAGU GCUUAAUCCC UCGGUGGCUG CCACCCUGGG 4140
GUUUGGGGCG UACUUGUCCA AGGCACAUGG CAUCAAUCCC AACAUUAGGA CUGGGGUCAG 4200
GACUGUGACG ACCGGGGCGC CCAUCACGUA CUCCACAUAU GGCAAAUUCC UCGCCGAUGG 4260
GGGCUGCGCA GGCGGCGCCU AUGACAUCAU CAUAUGCGAU GAAUGCCAUG CCGUGGACUC 4320
UACCACCAUU CUCGGCAUCG GAACAGUCCU CGAUCAAGCA GAGACAGCCG GGGUCAGGCU 4380
AACUGUACUG GCUACGGCUA CGCCCCCCGG GUCAGUGACA ACCCCCCACC CCAACAUAGA 4440
GGAGGUGGCC CUCGGGCAGG AGGGUGAGAU CCCCUUCUAU GGGAGGGCGA UUCCCCUGUC 4500
AUACAUCAAG GGAGGAAGAC ACUUGAUCUU CUGCCACUCA AAGAAAAAGU GUGACGAGCU 4560
CGCGGCGGCC CUUCGGGGUA UGGGCUUGAA CGCAGUGGCA UACUACAGAG GGCUGGACGU 4620
CUCCGUAAUA CCAACUCAGG GAGACGUAGU GGUCGUCGCC ACCGACGCCC UCAUGACGGG 4680
GUUUACUGGA GACUUUGACU CCGUGAUCGA CUGCAACGUA GCGGUCACUC AAGUUGUAGA 4740
CUUCAGCUUG GACCCCACAU UCACCAUAAC CACACAGACU GUCCCUCAAG ACGCUGUCUC 4800
ACGUAGCCAG CGCCGGGGCC GCACGGGCAG GGGAAGACUG GGUAUUUAUA GGUAUGUUUC 4860
CACUGGUGAG CGAGCCUCAG GAAUGUUUGA CAGUGUAGUG CUCUGCGAGU GCUACGAUGC 4920
AGGGGCCGCA UGGUAUGAGC UCACACCAGC GGAGACCACC GUCAGGCUCA GAGCAUAUUU 4980
CAACACACCU GGUUUGCCUG UGUGCCAAGA CCAUCUUGAG UUUUGGGAGG CAGUUUUCAC 5040
CGGCCUCACA CACAUAGAUG CCCACUUCCU UUCCCAAACA AAGCAAUCGG GGGAAAAUUU 5100
CGCAUACUUA ACAGCCUACC AGGCUACAGU GUGCGCUAGG GCCAAAGCCC CCCCCCCGUC 5160
CUGGGACGUC AUGUGGAAGU GUUUGACUCG ACUCAAGCCC ACACUCGUGG GCCCCACACC 5220
UCUCCUGUAC CGCUUGGGCU CUGUUACCAA CGAGGUCACC CUCACGCAUC CUGUGACGAA 5280
AUACAUCGCC ACCUGCAUGC AAGCCGACCU UGAGGUCAUG ACCAGCACGU GGGUCUUAGC 5340
UGGGGGGGUC UUGGCGGCCG UCGCCGCGUA CUGCCUGGCG ACCGGGUGUG UUUGCAUCAU 5400
CGGCCGCUUG CACGUUAACC AGCGAGCCGU CGUUGCACCG GACAAGGAGG UCCUCUAUGA 5460
GGCUUUUGAU GAGAUGGAGG AAUGUGCCUC UAGAGCGGCU CUCAUUGAAG AGGGGCAGCG 5520
GAUAGCCGAG AUGCUGAAGU CCAAGAUCCA AGGCUUAUUG CAGCAAGCUU CCAAACAAGC 5580
UCAAGACAUA CAACCCGCUG UGCAGGCUUC UUGGCCCAAG GUAGAGCAAU UCUGGGCCAA 5640
ACACAUGUGG AACUUCAUCA GCGGCAUUCA AUACCUCGCA GGACUAUCAA CACUGCCAGG 5700
GAACCCUGCU GUAGCUUCCA UGAUGGCAUU CAGUGCCGCC CUCACCAGUC CGUUGUCAAC 5760
UAGCACCACU AUCCUUCUCA ACAUUUUGGG GGGCUGGCUA GCAUCCCAAA UUGCGCCUCC 5820
CGCGGGGGCU ACCGGCUUCG UCGUCAGUGG CCUGGUGGGG GCUGCCGUAG GCAGCAUAGG 5880
CUUGGGUAAG GUGCUGGUGG ACAUCCUGGC AGGGUAUGGU GCGGGCAUUU CGGGGGCUCU 5940
CGUCGCAUUC AAGAUCAUGU CUGGCGAGAA GCCCUCCAUG GAGGAUGUUG UCAACCUGCU 6000
GCCUGGAAUU CUGUCUCCGG GUGCCCUGGU GGUGGGAGUC AUCUGCGCGG CCAUCCUGCG 6060
CCGACACGUG GGACCGGGGG AAGGCGCUGU CCAAUGGAUG AAUAGGCUCA UUGCCUUUGC 6120
UUCCAGAGGA AACCACGUCG CCCCCACCCA CUACGUGACG GAGUCGGAUG CGUCGCAGCG 6180
UGUGACCCAA CUACUUGGCU CCCUUACCAU AACCAGCCUG CUCAGGAGAC UCCACAACUG 6240
GAUUACUGAA GACUGCCCCA UCCCAUGCAG CGGCUCGUGG CUCCGCGAUG UGUGGGAUUG 6300
GGUUUGCACC AUCCUAACAG ACUUUAAAAA CUGGCUGACC UCCAAAUUGU UCCCAAAGAU 6360
GCCUGGUCUC


CCCUUUAUCU


CUUGUCAAAA 


GGGGUACAAG 


GGCGUGUGGG 


CUGGCACUGG 


6420 


iiAiiOAiiftArr 


ArArftfiiifiiir 

nV/r\V/Uu UU Uv 


OUUGCGGCGC 


CAAUAUCUCU 

v/ntMi n u V u v w 


GGCAAUGUCC 


GCCUGGGCUC 


6480 


r/iiif^AnAAiiii 

V/HUUHUMHUU 


nV/UUUUV/V/Un 


AAAmifirAii 


GAAUAUCIJGG 


CAGGGGACCU 


UUCCCAUCAA 


6S40 


iiiif^iiiiArflrft 

UUUUUMl/HUu 


UnUUuV/trHuU 


UV/U UUV/vrUMn 


ArOCGCACCA 

ri V/ V/\/ ui/ n V v/n 


AACUUUAAGA 


UCGCCAUCUG 


6600 


UnUUUUUUV/U 


UV/lrUvounUU 


ArfirfifiAfifiii 

n V/ U \/ U Url V2U u 


GArGCAGCAC 


GGGUCAUACC 


ACUACAUAAC 


6660 


HuUMV/UUnV/v 


AnifiAiiAAni 


iiftAAAfiiiiirr 


lIllGrrAACUA 

VlUVJV/V/nnv un 


CCUUCliCCAG 


AGUliCUUUUC 

nviu wv/u Kiu UV/ 


6720 


UUuuuUUuHU 


^f^A^^linrAf^A 


iirrAiiAftftiiii 


iiGrrrrrAiiA 

U Vi V/ V/ V/ V/ V/ A U n 


rrCAAGCCGU 

V/ V/ Vl flrt U V/ V vi U 


UUIJUUCGGGA 


6780 


UUAuUUV/Ul/U 


IIIIPflftPAIIIIft 
UUV/UUV/UUUU 


ftftniiiAAiiiir 


AiiiiiiGiirGiir 


GGGUCUCAGC 

UliVl UV/ U vrAWIV/ 


UCCCIIUGCGA 


6840 




^APArA^^APfi 


iiAiiiiftArftiir 

UnUUUHl/UUV/ 


TAIIGrilAArA 

V/nUUV/UftnV/n 


GAorrAiirrr 

U tt V/ v/\/ n U vV/ V/ 


AlIAlirACGGr 

rt U rt U V n V Vi \J V 


6Q00 


UuAUMl/UuV/H 




UUUV/MV/UUuU 


uUV/MV/V/V/V/V/Vl 


lirrGAGGPAA 

U V/ V/ VJ n Vi Vi V/n n 


GnimirAGr 

\i V/ U V/ V/ U V/n li V/ 




uAulrUAulrUM 


iirn^^rArcAii 

UV/uUUAl/UAU 


l/UU UUV/UAUV/ 


V/ft vrV/ uvaV/MVrfV/ 


ArrPACGGrA 

rt V/ V/ V/H V/ V3 VJ V/n 


AGGrniAIIGA 

H V] V} V/ V/ U M U UM 


7090 


UuUuUAUAUii 


uUuuAUUUUA 


AOlrUuUUV/AU 


VJ U Vl Vi V2 Vl V/ U H U 


Vi U Vl H V/ V/ V/ VJ V2 n 


IIAGAGiirilGA 

UH VJM Vi U V/ U UH 


70A0 


uUlrUAAAuUU 


Aiioniiiiriirin 

uUUuUUl/Uuu 


HLrUlrUV/UV/UH 


V/ V/ V/n M U VJ u U V/ 


GAAGAAAGGA 


GrGAmillGA 

VJ V/ uM V/ V/ U U VJrV 


7140 


^imiiirf^AiiA 

UOUUUUUMUA 


PPAIir^^/^AAII 


AiiAiiftnirrr 


TAAGAAGAGA 

V/M M Vj nn Vj n Vin 


U U V/ 1/ V r\ V V/r\ U 


rriiiiAcrGGr 

V/V/ U V/MV/V/UUv 


7?00 


UUuuuV/Al/UU 


mi^AiiiiArA 


Arrf^Arr^Pfi 


IIGIIGGAAIirG 

UvlUUVlKnUV/Vi 


IIGGAAGAGGO 


TAGAIIIIArrA 


7?fiO 




uUUuV/uUUl/U 




V/ V/ l/i/ V/ V/ Un M VI 


AAAArrrPGA 

fi rt n V/ V/ V/ V/ U n 


ccxcuccccc 




A A^^^A^APf^r 


UUuAV/AuUuU 


uUlrUuMuUUH 


UrtVaV/ UV/V/rt l^n 


GPAGAiiGrrr 

VJV/rtVJrt U12V/V/V/ 


IIArAArAGPII 

UnV/nnV/nUV/U 


( oov/ 


UuUV/HUl/HHu 


UV/lrUUUUUVrV/ 




AAGPGGrGAII 

An uvr VJ Vi V/Vi M Lr 


iirAGGrniiiii 


rrACGGGGGC 

V/V/nV/UvlUUUv 


7440 




UMUUVyV/UUV/M 


U U\/U\Jnlf UV/V/ 


rrrrGAiiGAG 

V/V/V/V/VJrt UVinVi 


IIIJGGOCOIIIIU 

U UViuV/ vV/ U U U 


OGGAGACAGG 

V/UUoUnV/nUU 


7500 


iiiippAiinirp 

UUUv/AUUUV/l/ 


UV/OAUUi/lrl/U 


V/UV/UV/VlAV3ViU 


GGAGrniGGA 

ViV3HV3V/V/ Uvl Vin 


GAiirrAGAni 

VJrtUV/V^nVirtV/ u 


IIGGAGrniGA 

U Vi Un vi V/ V/ U Un 




nrAftriiiA/^A^ 




V/ V/ V/ V/V/ V/H V3V] u 


GGGGGIIGGIIA 


AAAAAAAAAM 


TAGGnirGGG 


76? 0 


A IIA 1 1 1 1 AA 1 lA 1 1 

GUCUUGGUCU 


A A IIIIAA IIA AA 

ACUUGCUCCG 


A /^A A AAA A A A 

AGGAGGACGA 


AIIAAAII/^A 

ClICCGUCGUG 


UGOUGLULLA 


UtiUCAUAUUO 


7C0A 


CUGGACCGGG 


GCUCUAAUAA 


CUCCUUGUAG 


CCCCGAAGAG 


GAAAAGUUGC 


CAAUUGGCCC 


7740 


CUUGAGCAAC 


UCCCUGUUGC 


GAUAUCACAA 


CAAGGUGUAC 


UGUACCACAU 


CAAAGAGCGC 


7800 


CUCAUUAAGG 


GCUAAAAAGG 


UAACUUUUGA 


UAGGAUGCAA 


GCGCUCGACG 


CUCAUUAU6A 


7860 


CUCAGUCUUG 


AAGGACAUUA 


AGCUAGCGGC 


CUCCAAGGUC 


ACCGCAAGGC 


UUCUCACUUU 


7920 


AGAGGAGGCC 


UGCCA6UUAA 


CUCCACCCCA 


CUCUGCAA6A 


UCCAAGUAUG 


GGUUUGGGGC 


7980 


UAAGGAGGUC 


CGCAGCUUGU 


CCQGGAGAGC 


CGUUAACCAC 


AUCAAGUCCG 


UGUGGAAG6A 


8040 


CCUCCU6GAA 


GACACACAAA 


CACCAAUUCC 


UACAACCAUC 


AUGGCCAAAA 


AUGAGGUGUU 


8100 
CUGCGUGGAC


CCCACCAAGG 


GGGGUAAGAA 


AGCAGCUCGC 


CUUAUCGUUU 


ACCCU6ACCU 


8160 


CGGCGUCAGG 


GUCUGCGAGA 


AAAUGGCCCU 


UUAUGAUAUC 


ACACAAAAGC 


UUCCUCAGGC 


8220 


GGUGAUGGGG 


GCUUCUUAUG 


GAUUCCAGUA 


CUCCCCC6CU 


CAGCGGGUGG 


AGUUUCUCUU 


8280 


GAAGGCAUGG 


GCGGAAAAGA 


AAGACCCUAU 


GGGUUUUUCG 


UAUGAUACCC 


GAUGCUUUGA 


8340 


CUCAACCGUC 


ACUGAGAGAG 


ACAUCAGGAC 


UGAGGAGUCC 


AUAUAUCGGG 


CUUGUUCCUU 


8400 


GCCCGAGGAG 


GCCCACACUG 


CCAUACACUC 


ACUGACUGAG 


AGACUUUACG 


UGGGAGGGCC 


8460 


CAUGUUCAAC 


AGCAAGGGCC 


AGACCUGCGG 


GUACAGGCGU 


UGCCGCGCCA 


GCGGGGUGCU 


8520 


UACCACUAGC 


AUGGGGAACA 


CCAUCACAUG 


CUAUGUGAAA 


GCCUUAGCGG 


CCUGUAAGGC 


8580 


UGCAGGGAUA 


AUUGCGCCCA 


CAAUGCUGGU 


AUGCGGCGAU 


GACUUGGUUG 


UCAUCUCAGA 


8640 


GAGCCAGGGG 


ACCGAGGAGG 


ACGAGCGGAA 


CCUGAGAGCC 


UUCACGGAGG 


CUAU6ACCAG 


8700 


GUAUUCUGCC 


CCUCCUGGUG 


ACCCCCCCAG 


ACCGGAAUAU 


GACCUGGAGC 


UGAUAACAUC 


8760 


UUGCUCCUCA 


AAUGUGUCUG 


UGGCGUUGGG 


CCCACAAG6C 


CGCCGCAGAU 


ACUACCUGAC 


8820 


CAGAGACCCU 


ACCACUCCAA 


UCGCCCGGGC 


UGCCUGGGAA 


ACAGUUAGAC 


ACUCCCCUGU 


8880 


CAAUUCAIiGG 


CUAGGAAACA 


UCAUCCAGUA 


CGCCCCAACC 


AUAUGGGCUC 


GCAUGGUCCU 


8940 


GAUGACACAC 


UUCUUCUCCA 


UUCUCAUGGC 


CCAAGAUACU 


CUGGACCAGA 


ACCUCAACUU 


9000 


UGAGAUGUAC 


GGAGCGGUGU 


ACUCC6UGAG 


UCCCUUGGAC 


CUCCCAGCCA 


UAAUUGAAAG 


9060 


GUUACACGGG 


CUUGACGCUU 


UCUCUCUGCA 


CACAUACACU 


CCCCACGAAC 


UGACACG6GU 


9120 


GGCUUCAGCC 


CUCAGAAAAC 


UUGGGGCGCC 


ACCCCUCA6A 


GCGUGGAAGA 


GCCGGGCACG 


9180 


UGCAGUCAGG 


GCGUCCCUCA 


UCUCCCGUGG 


GGGGAGAGCG 


GCCGUUUGCG 


6CCGAUAUCU 


9240 


CUUCAACUGG 


GCGGUGAAGA 


CCAAGCUCAA 


ACUCACUCCA 


UUGCCGGAAG 


CGCGCCUCCU 


9300 


GGAUUUAUCC 


AGCUGGUUCA 


CUGUCGGCGC 


CGGCGGGGGC 


GACAUUUAUC 


ACAGCGUGUC 


9360 


GCGUGCCCGA 


CCCCGCUUAU 


UACUCCUUGG 


CCUACUCCUA 


CUUUUUGUAG 


GGGUAGGCCU 


9420 


UUUCCUACUC 


CCCGCUCGGU 


AGAGCGGCAC 


ACAUUAGCUA 


CACUCCAUAG 


CUAACUGUCC 


9480 


CUUUUUUiiUU 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUUU 


9540 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUUU 


UUUUUUUUU 


9589 
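
The listing is laid out in rows of 60 nucleotides (six groups of ten), each followed by the position of its last nucleotide. The sketch below derives the expected row count and the length of the final row from the stated sequence length of 9,589; the function names are hypothetical, and the check is offered only as an aid for proofreading the listing.

```python
def listing_layout(length: int, row: int = 60):
    """Return (number of full rows, nucleotides in the final partial row)."""
    return divmod(length, row)


def row_end_positions(length: int, row: int = 60):
    """Position labels expected at the end of each row of the listing."""
    positions = list(range(row, length + 1, row))
    if length % row:
        positions.append(length)  # final, shorter row
    return positions


if __name__ == "__main__":
    full_rows, last_row = listing_layout(9589)
    print(full_rows, last_row)
    print(row_end_positions(9589)[-2:])
```

For a length of 9,589 the final row holds 49 nucleotides, which agrees with the last row of the listing above (four groups of ten and one group of nine).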
Sequence ID No. 2
Sequence Length: 9,589 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ACCCGCCCCT AATAGGGGCG ACACTCCGCC ATGAACCACT CCCCTGTGAG GAACTACTGT 60
CTTCACGCAG AAAGCGTCTA GCCATGGCGT TAGTATGAGT GTCGTACAGC CTCCAGGCCC 120
CCCCCTCCCG GGAGAGCCAT AGTGGTCTGC GGAACCGGTG AGTACACCGG AATTGCCGGG 180
AAGACTGGGT CCTTTCTTGG ATAAACCCAC TCTATGCCCG GTCATTTGGG CGTGCCCCCG 240
CAAGACTGCT AGCCGAGTAG CGTTGGGTTG CGAAAGGCCT TGTGGTACTG CCTGATAGGG 300
TGCTTGCGAG TGCCCCGGGA GGTCTCGTAG ACCGTGCACC ATGAGCACAA ATCCTAAACC 360
TCAAAGAAAA ACCAAAAGAA ACACCAACCG TCGCCCACAA GACGTTAAGT TTCCGGGCGG 420
CGGCCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG GGCCCCAGGT TGGGTGTGCG 480
CGCGACAAGG AAGACTTCGG AGCGGTCCCA GCCACGTGGA AGGCGCCAGC CCATCCCTAA 540
GGATCGGCGC TCCACTGGCA AATCCTGGGG AAAACCAGGA TACCCCTGGC CCCTATACGG 600
GAATGAGGGA CTCGGCTGGG CAGGATGGCT CCTGTCCCCC CGAGGTTCCC GTCCCTCTTG 660
GGGCCCCAAT GACCCCCGGC ATAGGTCCCG CAACGTGGGT AAGGTCATCG ATACCCTAAC 720
GTGCGGCTTT GCCGACCTCA TGGGGTACAT CCCTGTCGTA GGCGCCCCGC TCGGCGGCGT 780
CGCCAGAGCT CTCGCGCATG GCGTGAGAGT CCTGGAGGAC GGGGTTAATT TTGCAACAGG 840
GAACTTACCC GGTTGCTCCT TTTCTATCTT CTTGCTGGCC CTGCTGTCCT GCATCACCAC 900
CCCGGTCTCC GCTGCCGAAG TGAAGAACAT CAGTACCGGC TACATGGTGA CCAACGACTG 960
CACCAATGAT AGCATTACCT GGCAACTCCA GGCTGCTGTC CTCCACGTCC CCGGGTGCGT 1020
CCCGTGCGAG AAAGTGGGGA ATACATCTCG GTGCTGGATA CCGGTCTCAC CGAATGTGGC 1080
CGTGCAGCAG CCCGGCGCCC TCACGCAGGG CTTACGGACG CACATTGACA TGGTTGTGAT 1140
GTCCGCCACG CTCTGCTCCG CTCTTTACGT GGGGGACCTC TGCGGTGGGG TGATGCTTGC 1200
AGCCCAGATG TTCATTGTCT CGCCACAGCA CCACTGGTTT GTGCAAGACT GCAATTGCTC 1260
CATCTACCCT GGTACCATCA CTGGACACCG CATGGCGTGG GACATGATGA TGAACTGGTC 1320
GCCCACGGCT ACCATGATCC TGGCGTACGC GATGCGCGTC CCCGAGGTCA TCATAGACAT 1380
CATTGGCGGG GCTCATTGGG GCGTCATGTT CGGCTTAGCC TACTTCTCTA TGCAGGGAGC 1440
GTGGGCAAAA GTCGTTGTCA TTCTTTTGCT GGCCGCCGGG GTGGACGCGC AAACCCATAC 1500
CGTTGGGGGT TCTACCGCGC ATAACGCCAG GACCCTCACC GGCATGTTCT CCCTTGGTGC 1560
CAGGCAGAAA ATCCAGCTCA TCAACACCAA TGGCAGTTGG CACATCAACC GCACCGCCCT 1620
GAACTGCAAT GACTCTTTGC ACACCGGCTT CCTCGCGTCA CTGTTCTACA CCCACAGCTT 1680
CAACTCGTCA GGATGTCCCG AACGCATGTC CGCCTGCCGC AGTATCGAGG CCTTTCGGGT 1740
GGGATGGGGC GCCTTACAAT ATGAGGACAA TGTCACCAAT CCAGAGGATA TGAGACCGTA 1800
TTGCTGGCAC TACCCACCAA GACAGTGTGG TGTAGTCTCC GCGAGCTCTG TGTGTGGCCC 1860
AGTGTACTGT TTCACCCCCA GCCCAGTAGT AGTGGGTACG ACCGATAGAC TTGGAGCGCC 1920
CACTTACACG TGGGGGGAGA ATGAGACAGA TGTCTTCCTA TTGAACAGCA CTCGACCACC 1980
GCAGGGGTCA TGGTTCGGCT GCACGTGGAT GAACTCCACT GGCTACACCA AGACTTGCGG 2040
CGCACCACCC TGCCGCATTA GAGCTGACTT CAATGCCAGC ATGGACTTGT TGTGCCCCAC 2100
GGACTGTTTT AGGAAGCATC CTGATACCAC CTACATCAAA TGTGGCTCTG GGCCCTGGCT 2160
CACGCCAAGG TGCCTGATCG ACTACCCCTA CAGGCTCTGG CATTACCCCT GCACAGTTAA 2220
CTATACCATC TTCAAAATAA GGATGTATGT GGGGGGGGTC GAGCACAGGC TCACGGCTGC 2280
GTGCAATTTC ACTCGTGGGG ATCGTTGCAA CTTGGAGGAC AGAGACAGAA GTCAACTGTC 2340
TCCTTTGCTG CACTCCACCA CGGAGTGGGC CATTTTACCT TGCACTTACT CGGACCTGCC 2400
CGCCTTGTCG ACTGGTCTTC TCCACCTCCA CCAAAACATC GTGGACGTGC AATTCATGTA 2460
TGGCCTATCA CCTGCTCTCA CAAAATACAT CGTCCGATGG GAGTGGGTAG TACTCTTATT 2520
CCTGCTCTTA GCGGACGCCA GGGTTTGCGC CTGCTTATGG ATGCTCATCT TGTTGGGCCA 2580
GGCCGAAGCA GCACTAGAGA AGTTGGTCGT CTTGCACGCT GCGAGCGCAG CTAGCTGCAA 2640
TGGCTTCCTA TACTTTGTCA TCTTTTTCGT GGCTGCTTGG TACATCAAGG GTCGGGTAGT 2700
CCCCTTGGCT ACTTATTCCC TCACTGGCCT ATGGTCCTTT GGCCTACTGC TCCTAGCATT 2760
GCCCCAACAG GCTTATGCTT ATGACGCATC TGTACATGGT CAGATAGGAG CAGCTCTGTT 2820
GGTACTGATC ACTCTCTTTA CACTCACCCC CGGGTATAAG ACCCTTCTCA GCCGGTTTCT 2880
GTGGTGGTTG TGCTATCTTC TGACCCTGGC GGAAGCTATG GTCCAGGAGT GGGCACCACC 2940
TATGCAGGTG


CGCGGTGGCC 


GTGATGG6AT 


CATATGG6CC 


GTCGCCATAT 


TCTGCCCGGG 


3000 


TGTG6TGTTT 


GACATAACCA 


AGTGGCTCTT 


GGCGGTGCTT 


GGGCCTGCTT 


ATCTCCTAAA 


3060 


AGGT6CTTT6 


ACGCGTGTGC 


CGTACTTCGT 


CAGGGCTCAC 


GCTCTACTAA 


GGATGTGCAC 


3120 


CATGGTAA6G 


CATCTCGC6G 


GGGGTAGGTA 


CGTCCAGATG 


GTGCTACTAG 


CCCTTGGCA6 


3180 


GTGGACTGGC 


ACTTACATCT 


ATGACCACCT 


CACCCCTATG 


TCGGATTGGG 


CTGCTAATGG 


3240 


CCTGCGGGAC 


TTGGCG6TC6 


CCGTG6AGCC 


TATCATCTTC 


AGTCCGATGG 


AGAAAAAAGT 


3300 


CATCGTCTGG 


GGAGCGGAGA 


CAGCT6CTTG 


CGGGGATATC 


TTACAC6GAC 


TTCCC6TGTC 


3360 


CGCCCGACTT 


GGCCGGGAGG 


TCCTCCTTG6 


CCCA6CTGAT 


GGCTATACCT 


CCAA66GGTG 


3420 


GAGTCTTCTC 


GCCCCCATCA 


CTGCTTATGC 


CCAGCAGACA 


CGCGGCCTTT 


TGGGCACCAT 


3480 


AGTGGTGAGC 


ATGACGGGGC 


GCGACAAGAC 


AGAACA6GCC 


GGGGAGATTC 


AGGTCCTGTC 


3540 


CACGGTCACT 


CAGTCCTTCC 


TCGGAACAAC 


CATCTCGGGG 


GTCTTATGGA 


CTGTCTACCA 


3600 


TGGAGCTGGC 


AACAA6ACTC 


TAGCCGGCTC 


ACGG6GTCCG 


GTCACACAGA 


TGTACTCCAG 


3660 


TGCTGAGGGG 


GACTTAGTGG 


GGT6GCCCAG 


CCCCCCCGGG 


ACCAAATCTT 


TGGAGCCGTG 


3720 


CACGTGTGGA 


GCGGTCGACC 


TATACCTGGT 


CACGCGAAAC 


GCTGATGTCA 


TCCCGGCTCG 


3780 


AAGACGCGGG 


6ACAAGCGAG 


GAGCQCTACT 


CTCCCCGAGA 


CCTCTTTCCA 


CCTTGAAGGG 


3840 


GTCCTCGG6G 


GGCCCGGTGC 


TCTGCCCCAG 


AGGCCACGCT 


GTCGGGGTCT 


TCCGGGCAGC 


3900 


CGT6TGCTCC 


CGGGGCGTGG 


CCAAGTCCAT 


AGATTTTATC 


CCCGTTGAGA 


CACTTGACAT 


3960 


CGTCACTCGG 


TCCCCCACCT 


TTAGTGACAA 


CAGCACACCA 


CCTGCT6TGC 


CCCAAACTTA 


4020 


TCAGGTCGGG 


TACTTACATG 


CCCCGACTGG 


TAGTGGAAAG 


AGCACCAAAG 


TCCCTGTCGC 


4080 


GTAT6CCGCT 


CAGGGGTACA 


AAGTGCTAGT 


GCTTAATCCC 


TCGGTGGCTG 


CCACCCTGGG 


4140 


GTTTGGGGCQ 


TACTTGTCCA 


AGGCACATGG 


CATCAATCCC 


AACATTAGGA 


CTGGGGTCAG 


4200 


GACTGTGACG 


ACCGGGGCGC 


CCATCACGTA 


CTCCACATAT 


GGCAAATTCC 


TCGCCGATGG 


4260 


GGGCT6CGCA 


GGCGGCGCCT 


ATGACATCAT 


CATATGCGAT 


6AATGCCATG 


CCGTGGACTC 


4320 


TACCACCATT 


CTCGGCATCG 


GAACAGTCCT 


CGATCAAGCA 


GAGACAGCCG 


GGGTCAGGCT 


4380 


AACTGTACTG 


GCTACGGCTA 


CGCCCCCCGG 


GTCAGTGACA 


ACCCCCCACC 


CCAACATAGA 


4440 


GGAGGTGGCC 


CTCGGGCAGG 


AGGGTGAGAT 


CCCCTTCTAT 


GGGAGGGCGA 


TTCCCCTGTC 


4500 


ATACATCAAG 


GGAGGAAGAC 


ACTTGATCTT 


CTGCCACTCA 


AAGAAAAAGT 


6TGACGAGCT 


4560 


CGCGGC6GCC 


CTTCGGGGTA 


TGGGCTTGAA 


CGCAGTGGCA 


TACTACAGAG 


GGCTGGACGT 


4620 


CTCCGTAATA 


CCAACTCAGG 


GAGACGTAGT 


GGTCGTCGCC 


ACCGACGCCC 


TCATGACGGG 


4680 
GTTTACTGGA


GACTTTGACT 


CCGTGATCGA 


CT6CAACGTA 


GCGGTCACTC 


AAGTTGTAGA 


4740 


CTTCAGCTTG 


6ACCCCACAT 


TCACCATAAC 


CACACAGACT 


GTCCCTCAAG 


ACGCTGTCTC 


4800 


AC6TAGCCAG 


CGCC6GGGCC 


GCACGGGCAG 


G6GAAGACTG 


GGTATTTATA 


GGTATGTTTC 


4860 


CACT6GTGAG 


CGAGCCTCAG GAATGTTTGA CAGTGTAGTG CTCTGCGAGT GCTACGATGC 4920
AGGGGCCGCA TGGTATGAGC TCACACCAGC GGAGACCACC GTCAGGCTCA GAGCATATTT 4980
CAACACACCT GGTTTGCCTG TGTGCCAAGA CCATCTTGAG TTTTGGGAGG CAGTTTTCAC 5040
CGGCCTCACA CACATAGATG CCCACTTCCT TTCCCAAACA AAGCAATCGG GGGAAAATTT 5100
CGCATACTTA ACAGCCTACC AGGCTACAGT GTGCGCTAGG GCCAAAGCCC CCCCCCCGTC 5160
CTGGGACGTC ATGTGGAAGT GTTTGACTCG ACTCAAGCCC ACACTCGTGG GCCCCACACC 5220
TCTCCTGTAC CGCTTGGGCT CTGTTACCAA CGAGGTCACC CTCACGCATC CTGTGACGAA 5280
ATACATCGCC ACCTGCATGC AAGCCGACCT TGAGGTCATG ACCAGCACGT GGGTCTTAGC 5340
TGGGGGGGTC TTGGCGGCCG TCGCCGCGTA CTGCCTGGCG ACCGGGTGTG TTTGCATCAT 5400
CGGCCGCTTG CACGTTAACC AGCGAGCCGT CGTTGCACCG GACAAGGAGG TCCTCTATGA 5460
GGCTTTTGAT GAGATGGAGG AATGTGCCTC TAGAGCGGCT CTCATTGAAG AGGGGCAGCG 5520
GATAGCCGAG ATGCTGAAGT CCAAGATCCA AGGCTTATTG CAGCAAGCTT CCAAACAAGC 5580
TCAAGACATA CAACCCGCTG TGCAGGCTTC TTGGCCCAAG GTAGAGCAAT TCTGGGCCAA 5640
ACACATGTGG AACTTCATCA GCGGCATTCA ATACCTCGCA GGACTATCAA CACTGCCAGG 5700
GAACCCTGCT GTAGCTTCCA TGATGGCATT CAGTGCCGCC CTCACCAGTC CGTTGTCAAC 5760
TAGCACCACT ATCCTTCTCA ACATTTTGGG GGGCTGGCTA GCATCCCAAA TTGCGCCTCC 5820
CGCGGGGGCT ACCGGCTTCG TCGTCAGTGG CCTGGTGGGG GCTGCCGTAG GCAGCATAGG 5880
CTTGGGTAAG GTGCTGGTGG ACATCCTGGC AGGGTATGGT GCGGGCATTT CGGGGGCTCT 5940
CGTCGCATTC AAGATCATGT CTGGCGAGAA GCCCTCCATG GAGGATGTTG TCAACCTGCT 6000
GCCTGGAATT CTGTCTCCGG GTGCCCTGGT GGTGGGAGTC ATCTGCGCGG CCATCCTGCG 6060
CCGACACGTG GGACCGGGGG AAGGCGCTGT CCAATGGATG AATAGGCTCA TTGCCTTTGC 6120
TTCCAGAGGA AACCACGTCG CCCCCACCCA CTACGTGACG GAGTCGGATG CGTCGCAGCG 6180
TGTGACCCAA CTACTTGGCT CCCTTACCAT AACCAGCCTG CTCAGGAGAC TCCACAACTG 6240
GATTACTGAA GACTGCCCCA TCCCATGCAG CGGCTCGTGG CTCCGCGATG TGTGGGATTG 6300
GGTTTGCACC ATCCTAACAG ACTTTAAAAA CTGGCTGACC TCCAAATTGT TCCCAAAGAT 6360
GCCTGGTCTC CCCTTTATCT CTTGTCAAAA GGGGTACAAG GGCGTGTGGG CTGGCACTGG 6420
TATCATGACC ACACGGTGTC CTTGCGGCGC CAATATCTCT GGCAATGTCC GCCTGGGCTC 6480
CATGAGAATT ACGGGGCCCA AAACCTGCAT GAATATCTGG CAGGGGACCT TTCCCATCAA 6540
TTGTTACACG GAGGGCCAGT GCGTGCCGAA ACCCGCACCA AACTTTAAGA TCGCCATCTG 6600
GAGGGTGGCG GCCTCAGAGT ACGCGGAGGT GACGCAGCAC GGGTCATACC ACTACATAAC 6660
AGGACTTACC ACTGATAACT TGAAAGTTCC TTGCCAACTA CCTTCTCCAG AGTTCTTTTC 6720
CTGGGTGGAC GGAGTGCAGA TCCATAGGTT TGCCCCCATA CCGAAGCCGT TTTTTCGGGA 6780
TGAGGTCTCG TTCTGCGTTG GGCTTAATTC ATTTGTCGTC GGGTCTCAGC TCCCTTGGGA 6840
TCCTGAACCT GACACAGACG TATTGACGTC CATGCTAACA GACCCATCCC ATATCACGGC 6900
GGAGACTGCA GCGCGGCGTT TGGCACGGGG GTCACCCCCG TCCGAGGCAA GCTCCTCAGC 6960
GAGCCAGCTA TCGGCACCAT CGCTGCGAGC CACCTGCACC ACCCACGGCA AGGCCTATGA 7020
TGTGGACATG GTGGATGCCA ACCTGTTCAT GGGGGGCGAT GTGACCCGGA TAGAGTCTGA 7080
GTCCAAAGTG GTCGTTCTGG ACTCTCTCGA CCCAATGGTC GAAGAAAGGA GCGACCTTGA 7140
GCCTTCGATA CCATCGGAAT ATATGCTCCC CAAGAAGAGA TTCCCACCAG CCTTACCGGC 7200
TTGGGCACGG CCTGATTACA ACCCACCGCT TGTGGAATCG TGGAAGAGGC CAGATTACCA 7260
ACCGGCCACT GTTGCGGGCT GCGCTCTCCC CCCCCCTAAG AAAACCCCGA CGCCTCCCCC 7320
AAGGAGACGC CGGACAGTGG GTCTGAGTGA GAGCTCCATA GCAGATGCCC TACAACAGCT 7380
GGCCATCAAG TCCTTTGGCC AGCCCCCCCC AAGCGGCGAT TCAGGCCTTT CCACGGGGGC 7440
GGACGCAGCC GATTCCGGCA GTCGGACGCC CCCCGATGAG TTGGCCCTTT CGGAGACAGG 7500
TTCCATCTCC TCCATGCCCC CTCTCGAGGG GGAGCCTGGA GATCCAGACT TGGAGCCTGA 7560
GCAGGTAGAG CTTCAACCTC CCCCCCAGGG GGGGGTGGTA ACCCCCGGCT CAGGCTCGGG 7620
GTCTTGGTCT ACTTGCTCCG AGGAGGACGA CTCCGTCGTG TGCTGCTCCA TGTCATACTC 7680
CTGGACCGGG GCTCTAATAA CTCCTTGTAG CCCCGAAGAG GAAAAGTTGC CAATTGGCCC 7740
CTTGAGCAAC TCCCTGTTGC GATATCACAA CAAGGTGTAC TGTACCACAT CAAAGAGCGC 7800
CTCATTAAGG GCTAAAAAGG TAACTTTTGA TAGGATGCAA GCGCTCGACG CTCATTATGA 7860
CTCAGTCTTG AAGGACATTA AGCTAGCGGC CTCCAAGGTC ACCGCAAGGC TTCTCACTTT 7920
AGAGGAGGCC TGCCAGTTAA CTCCACCCCA CTCTGCAAGA TCCAAGTATG GGTTTGGGGC 7980
TAAGGAGGTC CGCAGCTTGT CCGGGAGAGC CGTTAACCAC ATCAAGTCCG TGTGGAAGGA 8040
CCTCCTGGAA GACACACAAA CACCAATTCC TACAACCATC ATGGCCAAAA ATGAGGTGTT 8100
CTGCGTGGAC CCCACCAAGG GGGGTAAGAA AGCAGCTCGC CTTATCGTTT ACCCTGACCT 8160
CGGCGTCAGG GTCTGCGAGA AAATGGCCCT TTATGATATC ACACAAAAGC TTCCTCAGGC 8220
GGTGATGGGG GCTTCTTATG GATTCCAGTA CTCCCCCGCT CAGCGGGTGG AGTTTCTCTT 8280
GAAGGCATGG GCGGAAAAGA AAGACCCTAT GGGTTTTTCG TATGATACCC GATGCTTTGA 8340
CTCAACCGTC ACTGAGAGAG ACATCAGGAC TGAGGAGTCC ATATATCGGG CTTGTTCCTT 8400
GCCCGAGGAG GCCCACACTG CCATACACTC ACTGACTGAG AGACTTTACG TGGGAGGGCC 8460
CATGTTCAAC AGCAAGGGCC AGACCTGCGG GTACAGGCGT TGCCGCGCCA GCGGGGTGCT 8520
TACCACTAGC ATGGGGAACA CCATCACATG CTATGTGAAA GCCTTAGCGG CCTGTAAGGC 8580
TGCAGGGATA ATTGCGCCCA CAATGCTGGT ATGCGGCGAT GACTTGGTTG TCATCTCAGA 8640
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCTGAGAGCC TTCACGGAGG CTATGACCAG 8700
GTATTCTGCC CCTCCTGGTG ACCCCCCCAG ACCGGAATAT GACCTGGAGC TGATAACATC 8760
TTGCTCCTCA AATGTGTCTG TGGCGTTGGG CCCACAAGGC CGCCGCAGAT ACTACCTGAC 8820
CAGAGACCCT ACCACTCCAA TCGCCCGGGC TGCCTGGGAA ACAGTTAGAC ACTCCCCTGT 8880
CAATTCATGG CTAGGAAACA TCATCCAGTA CGCCCCAACC ATATGGGCTC GCATGGTCCT 8940
GATGACACAC TTCTTCTCCA TTCTCATGGC CCAAGATACT CTGGACCAGA ACCTCAACTT 9000
TGAGATGTAC GGAGCGGTGT ACTCCGTGAG TCCCTTGGAC CTCCCAGCCA TAATTGAAAG 9060
GTTACACGGG CTTGACGCTT TCTCTCTGCA CACATACACT CCCCACGAAC TGACACGGGT 9120
GGCTTCAGCC CTCAGAAAAC TTGGGGCGCC ACCCCTCAGA GCGTGGAAGA GCCGGGCACG 9180
TGCAGTCAGG GCGTCCCTCA TCTCCCGTGG GGGGAGAGCG GCCGTTTGCG GCCGATATCT 9240
CTTCAACTGG GCGGTGAAGA CCAAGCTCAA ACTCACTCCA TTGCCGGAAG CGCGCCTCCT 9300
GGATTTATCC AGCTGGTTCA CTGTCGGCGC CGGCGGGGGC GACATTTATC ACAGCGTGTC 9360
GCGTGCCCGA CCCCGCTTAT TACTCCTTGG CCTACTCCTA CTTTTTGTAG GGGTAGGCCT 9420
TTTCCTACTC CCCGCTCGGT AGAGCGGCAC ACATTAGCTA CACTCCATAG CTAACTGTCC 9480
CTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 9540
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTT 9589
Sequence ID No. 3
Sequence Length: 3,970
Sequence Type: nucleic acid
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GGCATTACCC CTGCACAGTT AACTATACCA TCTTCAAAAT AAGGATGTAT GTGGGGGGGG 60
TCGAGCACAG GCTCACGGCT GCGTGCAATT TCACTCGTGG GGATCGTTGC AACTTGGAGG 120
ACAGAGACAG AAGTCAACTG TCTCCTTTGC TGCACTCCAC CACGGAGTGG GCCATTTTAC 180
CTTGCACTTA CTCGGACCTG CCCGCCTTGT CGACTGGTCT TCTCCACCTC CACCAAAACA 240
TCGTGGACGT GCAATTCATG TATGGCCTAT CACCTGCTCT CACAAAATAC ATCGTCCGAT 300
GGGAGTGGGT AGTACTCTTA TTCCTGCTCT TAGCGGACGC CAGGGTTTGC GCCTGCTTAT 360
GGATGCTCAT CTTGTTGGGC CAGGCCGAAG CAGCACTAGA GAAGTTGGTC GTCTTGCACG 420
CTGCGAGCGC AGCTAGCTGC AATGGCTTCC TATACTTTGT CATCTTTTTC GTGGCTGCTT 480
GGTACATCAA GGGTCGGGTA GTCCCCTTGG CTACTTATTC CCTCACTGGC CTATGGTCCT 540
TTGGCCTACT GCTCCTAGCA TTGCCCCAAC AGGCTTATGC TTATGACGCA TCTGTACATG 600
GTCAGATAGG AGCAGCTCTG TTGGTACTGA TCACTCTCTT TACACTCACC CCCGGGTATA 660
AGACCCTTCT CAGCCGGTTT CTGTGGTGGT TGTGCTATCT TCTGACCCTG GCGGAAGCTA 720
TGGTCCAGGA GTGGGCACCA CCTATGCAGG TGCGCGGTGG CCGTGATGGG ATCATATGGG 780
CCGTCGCCAT ATTCTGCCCG GGTGTGGTGT TTGACATAAC CAAGTGGCTC TTGGCGGTGC 840
TTGGGCCTGC TTATCTCCTA AAAGGTGCTT TGACGCGTGT GCCGTACTTC GTCAGGGCTC 900
ACGCTCTACT AAGGATGTGC ACCATGGTAA GGCATCTCGC GGGGGGTAGG TACGTCCAGA 960
TGGTGCTACT AGCCCTTGGC AGGTGGACTG GCACTTACAT CTATGACCAC CTCACCCCTA 1020
TGTCGGATTG GGCTGCTAAT GGCCTGCGGG ACTTGGCGGT CGCCGTGGAG CCTATCATCT 1080
TCAGTCCGAT GGAGAAAAAA GTCATCGTCT GGGGAGCGGA GACAGCTGCT TGCGGGGATA 1140
TCTTACACGG ACTTCCCGTG TCCGCCCGAC TTGGCCGGGA GGTCCTCCTT GGCCCAGCTG 1200
ATGGCTATAC CTCCAAGGGG TGGAGTCTTC TCGCCCCCAT CACTGCTTAT GCCCAGCAGA 1260
CACGCGGCCT TTTGGGCACC ATAGTGGTGA GCATGACGGG GCGCGACAAG ACAGAACAGG 1320
CCGGGGAGAT TCAGGTCCTG TCCACGGTCA CTCAGTCCTT CCTCGGAACA ACCATCTCGG 1380
GGGTCTTATG GACTGTCTAC CATGGAGCTG GCAACAAGAC TCTAGCCGGC TCACGGGGTC 1440
CGGTCACACA GATGTACTCC AGTGCTGAGG GGGACTTAGT GGGGTGGCCC AGCCCCCCCG 1500
GGACCAAATC TTTGGAGCCG TGCACGTGTG GAGCGGTCGA CCTATACCTG GTCACGCGAA 1560
ACGCTGATGT CATCCCGGCT CGAAGACGCG GGGACAAGCG AGGAGCGCTA CTCTCCCCGA 1620
GACCTCTTTC CACCTTGAAG GGGTCCTCGG GGGGCCCGGT GCTCTGCCCC AGAGGCCACG 1680
CTGTCGGGGT CTTCCGGGCA GCCGTGTGCT CCCGGGGCGT GGCCAAGTCC ATAGATTTTA 1740
TCCCCGTTGA GACACTTGAC ATCGTCACTC GGTCCCCCAC CTTTAGTGAC AACAGCACAC 1800
CACCTGCTGT GCCCCAAACT TATCAGGTCG GGTACTTACA TGCCCCGACT GGTAGTGGAA 1860
AGAGrAOCAA AGTCCCTGTC GCGTATGCCG CTCAGGGGTA CAAAGTGCTA GTGCTTAATC 1920


V/v 1 l/UU 1 UVjv 


TGCCACCCTG GGGTTTGGGG CGTACTTGTC CAAGGCACAT GGCATCAATC 1980
CCAACATTAG GACTGGGGTC AGGACTGTGA CGACCGGGGC GCCCATCACG TACTCCACAT 2040
ATGGCAAATT CCTCGCCGAT GGGGGCTGCG CAGGCGGCGC CTATGACATC ATCATATGCG 2100
ATGAATGCCA TGCCGTGGAC TCTACCACCA TTCTCGGCAT CGGAACAGTC CTCGATCAAG 2160
CAGAGACAGC CGGGGTCAGG CTAACTGTAC TGGCTACGGC TACGCCCCCC GGGTCAGTGA 2220
CAACCCCCCA CCCCAACATA GAGGAGGTGG CCCTCGGGCA GGAGGGTGAG ATCCCCTTCT 2280
ATGGGAGGGC GATTCCCCTG TCATACATCA AGGGAGGAAG ACACTTGATC TTCTGCCACT 2340
CAAAGAAAAA GTGTGACGAG CTCGCGGCGG CCCTTCGGGG TATGGGCTTG AACGCAGTGG 2400
CATACTACAG AGGGCTGGAC GTCTCCGTAA TACCAACTCA GGGAGACGTA GTGGTCGTCG 2460




i/UIV/AruAUu 


uuul 1 lAV/lu 


fjA/ifirTTTfiA 
uAuHlr 1 1 1 uH 


Ke 1 V/UU 1 UH 1 \i 






TAGCGGTCAC TCAAGTTGTA GACTTCAGCT TGGACCCCAC ATTCACCATA ACCACACAGA 2580
CTGTCCCTCA AGACGCTGTC TCACGTAGCC AGCGCCGGGG CCGCACGGGC AGGGGAAGAC 2640
TGGGTATTTA TAGGTATGTT TCCACTGGTG AGCGAGCCTC AGGAATGTTT GACAGTGTAG 2700
TGCTCTGCGA GTGCTACGAT GCAGGGGCCG CATGGTATGA GCTCACACCA GCGGAGACCA 2760
CCGTCAGGCT CAGAGCATAT TTCAACACAC CTGGTTTGCC TGTGTGCCAA GACCATCTTG 2820
AGTTTTGGGA GGCAGTTTTC ACCGGCCTCA CACACATAGA TGCCCACTTC CTTTCCCAAA 2880
CAAAGCAATC GGGGGAAAAT TTCGCATACT TAACAGCCTA CCAGGCTACA GTGTGCGCTA 2940
GGGCCAAAGC CCCCCCCCCG TCCTGGGACG TCATGTGGAA GTGTTTGACT CGACTCAAGC 3000
CCACACTCGT GGGCCCCACA CCTCTCCTGT ACCGCTTGGG CTCTGTTACC AACGAGGTCA 3060
CCCTCACGCA TCCTGTGACG AAATACATCG CCACCTGCAT GCAAGCCGAC CTTGAGGTCA 3120
TGACCAGCAC GTGGGTCTTA GCTGGGGGGG TCTTGGCGGC CGTCGCCGCG TACTGCCTGG 3180
CGACCGGGTG TGTTTGCATC ATCGGCCGCT TGCACGTTAA CCAGCGAGCC GTCGTTGCAC 3240
CGGACAAGGA GGTCCTCTAT GAGGCTTTTG ATGAGATGGA GGAATGTGCC TCTAGAGCGG 3300
CTCTCATTGA AGAGGGGCAG CGGATAGCCG AGATGCTGAA GTCCAAGATC CAAGGCTTAT 3360
TGCAGCAAGC TTCCAAACAA GCTCAAGACA TACAACCCGC TGTGCAGGCT TCTTGGCCCA 3420
AGGTAGAGCA ATTCTGGGCC AAACACATGT GGAACTTCAT CAGCGGCATT CAATACCTCG 3480
CAGGACTATC AACACTGCCA GGGAACCCTG CTGTAGCTTC CATGATGGCA TTCAGTGCCG 3540
CCCTCACCAG TCCGTTGTCA ACTAGCACCA CTATCCTTCT CAACATTTTG GGGGGCTGGC 3600
TAGCATCCCA AATTGCGCCT CCCGCGGGGG CTACCGGCTT CGTCGTCAGT GGCCTGGTGG 3660
GGGCTGCCGT AGGCAGCATA GGCTTGGGTA AGGTGCTGGT GGACATCCTG GCAGGGTATG 3720
GTGCGGGCAT TTCGGGGGCT CTCGTCGCAT TCAAGATCAT GTCTGGCGAG AAGCCCTCCA 3780
TGGAGGATGT TGTCAACCTG CTGCCTGGAA TTCTGTCTCC GGGTGCCCTG GTGGTGGGAG 3840
TCATCTGCGC GGCCATCCTG CGCCGACACG TGGGACCGGG GGAAGGCGCT GTCCAATGGA 3900
TGAATAGGCT CATTGCCTTT GCTTCCAGAG GAAACCACGT CGCCCCCACC CACTACGTGA 3960
CGGAGTCGGA 3970
Sequence ID No. 4
Sequence Length: 2,693 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ATTCTGTCTC 


AA AATA AAAT 

CGGGTGCCCT 


GGTGGTGGGA GTCATCTGCG


(/CibCtA 1 (/(/ 1 


uOdl/tuAOAC 


60 


ATA/\A * A AAA 

GTGGGACCGG


AAAA AAAAAA 

GGGAAGGCGC 


iGlCCAAIGd AlUAAl A(i(i(/ 


1 (/A 1 1 UOL 1 1 


l(/(/AUA 


120 


AAA A AAA AAA 

GGAAACCACG 


TA AAAA A A A A 

TCGCCCCCAC 


CCACTACGTG ACGGAGTCGG


A 1 GCG 1 (/U(/A 


b(/G 1 G 1 GACC 




CAACTACTTG GCTCCCTTAC CATAACCAGC CTGCTCAGGA GACTCCACAA CTGGATTACT 240
GAAGACTGCC CCATCCCATG CAGCGGCTCG TGGCTCCGCG ATGTGTGGGA TTGGGTTTGC 300
ACCATCCTAA CAGACTTTAA AAACTGGCTG ACCTCCAAAT TGTTCCCAAA GATGCCTGGT 360
CTCCCCTTTA TCTCTTGTCA AAAGGGGTAC AAGGGCGTGT GGGCTGGCAC TGGTATCATG 420
ACCACACGGT GTCCTTGCGG CGCCAATATC TCTGGCAATG TCCGCCTGGG CTCCATGAGA 480
ATTACGGGGC CCAAAACCTG CATGAATATC TGGCAGGGGA CCTTTCCCAT CAATTGTTAC 540
ACGGAGGGCC AGTGCGTGCC GAAACCCGCA CCAAACTTTA AGATCGCCAT CTGGAGGGTG 600
GCGGCCTCAG AGTACGCGGA GGTGACGCAG CACGGGTCAT ACCACTACAT AACAGGACTT 660
ACCACTGATA ACTTGAAAGT TCCTTGCCAA CTACCTTCTC CAGAGTTCTT TTCCTGGGTG 720
GACGGAGTGC AGATCCATAG GTTTGCCCCC ATACCGAAGC CGTTTTTTCG GGATGAGGTC 780
TCGTTCTGCG TTGGGCTTAA TTCATTTGTC GTCGGGTCTC AGCTCCCTTG CGATCCTGAA 840
CCTGACACAG ACGTATTGAC GTCCATGCTA ACAGACCCAT CCCATATCAC GGCGGAGACT 900
GCAGCGCGGC GTTTGGCACG GGGGTCACCC CCGTCCGAGG CAAGCTCCTC AGCGAGCCAG 960
CTATCGGCAC CATCGCTGCG AGCCACCTGC ACCACCCACG GCAAGGCCTA TGATGTGGAC 1020
ATGGTGGATG CCAACCTGTT CATGGGGGGC GATGTGACCC GGATAGAGTC TGAGTCCAAA 1080
GTGGTCGTTC TGGACTCTCT CGACCCAATG GTCGAAGAAA GGAGCGACCT TGAGCCTTCG 1140
ATACCATCGG AATATATGCT CCCCAAGAAG AGATTCCCAC CAGCCTTACC GGCTTGGGCA 1200
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(/butt luAl 1 


AtAAtttAtt 


Ulr 1 t U 1 UUAH 


1 UU 1 uuAHuH 


nnrrAHATTA 






At 1 U 1 1 utuU 


btlbtbtl tl 




AAftAAAArrr 




V/ U 1/ n n U \J n U n 




tuttbbAtAu 


Ibubit fuAu 




ATAftrAftAT(5 


V/vw 1 nvnrtV/n 


firififirPATr 

UV/ 1 UUV/V/n 1 V/ 


1 OOv 


AAG 1 CC M 1 b 


bttAuttttt 


tttHHUV/UUl/ 




TTTrrAPfififi 

1 1 1 V/V/nUUUU 


UUvUUni/UV/n ■ 




GCCuA i 1 ttu 


btAultuuAt 


UtUl/k/UlrUH 1 




TTTrnftAftAP 

1 1 1 \/UUHUnV/ 


A^^fiTTrCA TO 

nUU t 1 V/VrH 1 V/ 


1 Juv 


1 CC 1 ttA 1 bt 


tttt ItltuA 


buuuuAutt 1 


AAA AATAAA^ 


Ml/ 1 1 uUnUV/V/ 


THAftrAnftTA 


1 Sfi/) 


GAbCI ItAAt 


t 1 tttttttA 


uuuuuUuui u 






UUUU 1 V/ 1 1 UU 


1 U4 Vr 


TAT A ATTAAT 

1 C 1 AC 1 1 bt 1 


Af*^ A/^A Ar^r^ A 
ttuAubAbbA 


tuAt 1 Ul/Ul 1/ 




rrA TftTf'ATA 

IrV/n 1 u i Un 1 H 


Vf I l/V/ 1 UUrtV/V* 


IfiAft 
1 OOu 


/\/\/\/\/\TAT A A 

GGGuCTCiAA 


TA A ATAPTTr^ 

1 AAt 1 tt 1 lb 


i AbttttuAA 


^A/^^iAAAAf^T 


1 Ulrl/MA 1 1 UU 


rrrrTTftAnr 

IrUOl/ 1 1 UHUV/ 


1 740 

1 r Hyj 


A AATAAATAT 

AACTCCCTGl 


IbtbAl Al tA 


tAAtAAbui u 


TAPT^TArrA 


PATPAAAf^Aft 




1 AAA 


A A/^/^ATA AAA 

AGGGCTAAAA 


A AAT A M^TTT 

Abb 1 AAt I i 1 


TTATAr^nAT^^ 
1 bA I AUbA 1 b 


PA A^irnrirn 




1 UMV/ 1 V/rtU 1 \/ 




TT/^ A A ^^AA/^ A 

1 IGAAGGAOA 


TTA Ar/^TA 

1 1 AAbt 1 AUt 


AAAATAAA A/i 

bbtt 1 ttAAu 


^TPAPrfirAA 


nftpTTpTPAr 


1 1 1 MUHUUMU 




UOC 1 Ul/l/Au 1 


1 AAt ittAtt 




AfiATPf^AAAT 


M 1 UUU 1 f 1 UU 


ftfirTAAHfiAfi 

UU\/ 1 HnUUnU 


1<)Aft 


GlCCGCAGCI 


Ibl ttbbbAb 


Arr^^rTTA Af^ 
Abttbl lAAt 


AAPAirA Ar^T 


UV/U 1 U 1 UUHM 


UUHUl/ 1 \/\^ 1 U 


9A4A 




A A A/^A^r^A AT 
AAAtAttAAl 


TrrTAPAArr 

1 UU 1 




AAAAinAfifiT 

HMrtn 1 UHUU 1 


U 1 1 V/ 1 UV/U 1 u 




GACCCCACCA 


AGGGGGGTAA 


GAAAGCAGCT


AAAATT A TAA 

CGCCTTATCG 


TTT A AAA TA A 

TTTACCCTGA 


AA TA AA AATA 

Ct ItbGtbIt 


0 1 C A 


AGGGTCTGCG AGAAAATGGC CCTTTATGAT ATCACACAAA AGCTTCCTCA GGCGGTGATG 2220
GGGGCTTCTT ATGGATTCCA GTACTCCCCC GCTCAGCGGG TGGAGTTTCT CTTGAAGGCA 2280
TGGGCGGAAA AGAAAGACCC TATGGGTTTT TCGTATGATA CCCGATGCTT TGACTCAACC 2340
GTCACTGAGA GAGACATCAG GACTGAGGAG TCCATATATC GGGCTTGTTC CTTGCCCGAG 2400
GAGGCCCACA CTGCCATACA CTCACTGACT GAGAGACTTT ACGTGGGAGG GCCCATGTTC 2460
AACAGCAAGG GCCAGAGCTG CGGGTACAGG CGTTGCCGCG CCAGCGGGGT GCTTACCACT 2520
AGCATGGGGA ACACCATCAC ATGCTATGTG AAAGCCTTAG CGGCCTGTAA GGCTGCAGGG 2580
ATAATTGCGC CCACAATGCT GGTATGCGGC GATGACTTGG TTGTCATCTC AGAGAGCCAG 2640
GGGACCGAGG AGGACGAGCG GAACCTGAGA GCCTTCACGG AGGCTATGAC CAG 2693
Sequence ID No. 5
Sequence Length: 3,033
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly

80 85 90

Leu Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro

95 100 105

Ser Trp Gly Pro Asn Asp Pro Arg His Arg Ser Arg Asn Val Gly

110 115 120

Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Val Ala Arg Ala

140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Phe Ala

155 160 165

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180

Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala Glu Val Lys

185 190 195

Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr Asn Asp

200 205 210

Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro Gly

215 220 225

Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser Arg Cys Trp Ile

230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Gln Pro Gly Ala Leu Thr

245 250 255

Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr

260 265 270

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met

275 280 285

Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe

290 295 300

Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala

320 325 330

Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro Glu Val Ile Ile

335 340 345

Asp Ile Ile Gly Gly Ala His Trp Gly Val Met Phe Gly Leu Ala

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Val Ile Leu

365 370 375

Leu Leu Ala Ala Gly Val Asp Ala Gln Thr His Thr Val Gly Gly






EP 0 532 167 A2 



380 385 390

Ser Thr Ala His Asn Ala Arg Thr Leu Thr Gly Met Phe Ser Leu

395 400 405

Gly Ala Arg Gln Lys Ile Gln Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu His Thr

425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Ser Phe Asn Ser Ser

440 445 450

Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala Phe

455 460 465

Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn

470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln

485 490 495

Cys Gly Val Val Ser Ala Ser Ser Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly

515 520 525

Ala Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu

530 535 540

Leu Asn Ser Thr Arg Pro Pro Gln Gly Ser Trp Phe Gly Cys Thr

545 550 555

Trp Met Asn Ser Thr Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Ala Asp Phe Asn Ala Ser Met Asp Leu Leu Cys

575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Thr Thr Tyr Ile Lys

590 595 600

Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Ile Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Tyr Thr Ile

620 625 630

Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Thr

635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Asn Leu Glu Asp

650 655 660

Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675

Trp Ala Ile Leu Pro Cys Thr Tyr Ser Asp Leu Pro Ala Leu Ser

680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Phe

695 700 705

Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp

710 715 720

Glu Trp Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val

725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Val Val Leu His Ala Ala Ser Ala Ala Ser

755 760 765

Cys Asn Gly Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp

770 775 780

Tyr Ile Lys Gly Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr

785 790 795

Gly Leu Trp Ser Phe Gly Leu Leu Leu Leu Ala Leu Pro Gln Gln

800 805 810

Ala Tyr Ala Tyr Asp Ala Ser Val His Gly Gln Ile Gly Ala Ala
815 820 825

Leu Leu Val Leu Ile Thr Leu Phe Thr Leu Thr Pro Gly Tyr Lys

830 835 840

Thr Leu Leu Ser Arg Phe Leu Trp Trp Leu Cys Tyr Leu Leu Thr

845 850 855

Leu Ala Glu Ala Met Val Gln Glu Trp Ala Pro Pro Met Gln Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala Val Ala Ile Phe Cys

875 880 885

Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu Leu Ala Val Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg Val Pro Tyr

905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met Val Arg

920 925 930

His Leu Ala Gly Gly Arg Tyr Val Gln Met Val Leu Leu Ala Leu

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met

950 955 960

Ser Asp Trp Ala Ala Asn Gly Leu Arg Asp Leu Ala Val Ala Val

965 970 975

Glu Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990

Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp

1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035
Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Thr Ile Val Val Ser

1040 1045 1050

Met Thr Gly Arg Asp Lys Thr Glu Gln Ala Gly Glu Ile Glu Val

1055 1060 1065

Leu Ser Thr Val Thr Gln Ser Phe Leu Gly Thr Thr Ile Ser Gly

1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala

1085 1090 1095

Gly Ser Arg Gly Pro Val Thr Gln Met Tyr Ser Ser Ala Glu Gly

1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Glu

1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn

1130 1135 1140

Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys Arg Gly Ala

1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly

1160 1165 1170

Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe Arg

1175 1180 1185

Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser

1205 1210 1215

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gln

1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro

1235 1240 1245

Val Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala

1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr

1280 1285 1290

Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala

1295 1300 1305

Asp Gly Gly Cys Ala Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp

1310 1315 1320

Glu Cys His Ala Val Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr

1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Thr Val Leu

1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn

1355 1360 1365

Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu Ile Pro Phe Tyr

1370 1375 1380

Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly Arg His Leu

1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala

1400 1405 1410

Leu Arg Gly Met Gly Leu Asn Ala Val Ala Tyr Tyr Arg Gly Leu

1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala

1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val

1445 1450 1455

Ile Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu

1460 1465 1470
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala

1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu

1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met

1505 1510 1515

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala

1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala

1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu

1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His

1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Ala Tyr Leu

1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro

1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro

1610 1615 1620

Trp Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ser Val

1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala

1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp Val

1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala

1670 1675 1680

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Val Asn Gln Arg
1685 1690 1695

Ala Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp

1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly

1715 1720 1725

Gln Arg Ile Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu

1730 1735 1740

Gln Gln Ala Ser Lys Gln Ala Gln Asp Ile Gln Pro Ala Val Gln

1745 1750 1755

Ala Ser Trp Pro Lys Val Glu Gln Phe Trp Ala Lys His Met Trp

1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu

1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala

1790 1795 1800

Leu Thr Ser Pro Leu Ser Thr Ser Thr Thr Ile Leu Leu Asn Ile

1805 1810 1815

Leu Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala

1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser

1835 1840 1845

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly

1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly

1865 1870 1875

Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro Gly Ile

1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile

1895 1900 1905
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Leu Leu Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Asn Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Arg Asp Val Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Thr Ser Lys Leu Phe Pro Lys Met Pro Gly Leu

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Ile Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly Asn Val Arg Leu Gly Ser Met Arg Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Met Asn Ile Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070

Glu Gly Gln Cys Val Pro Lys Pro Ala Pro Asn Phe Lys Ile Ala

2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu Val Thr Gln His

2090 2095 2100

Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp Asn Leu Lys

2105 2110 2115

Val Pro Cys Gln Leu Pro Ser Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Ile Pro Lys Pro Phe Phe

2135 2140 2145

Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val

2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu

2165 2170 2175

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala

2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser

2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr

2210 2215 2220

Thr His Gly Lys Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu

2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Glu Ser Lys Val

2240 2245 2250

Val Val Leu Asp Ser Leu Asp Pro Met Val Glu Glu Arg Ser Asp

2255 2260 2265

Leu Glu Pro Ser Ile Pro Ser Glu Tyr Met Leu Pro Lys Lys Arg

2270 2275 2280

Phe Pro Pro Ala Leu Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro

2285 2290 2295

Pro Leu Val Glu Ser Trp Lys Arg Pro Asp Tyr Gln Pro Ala Thr

2300 2305 2310

Val Ala Gly Cys Ala Leu Pro Pro Pro Lys Lys Thr Pro Thr Pro

2315 2320 2325

Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser Glu Ser Ser Ile

2330 2335 2340
Ala Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe Gly Gln Pro

2345 2350 2355

Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Asp Ala Ala

2360 2365 2370

Asp Ser Gly Ser Arg Thr Pro Pro Asp Glu Leu Ala Leu Ser Glu

2375 2380 2385

Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly

2390 2395 2400

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Leu Gln Pro Pro Pro

2405 2410 2415

Gln Gly Gly Val Val Thr Pro Gly Ser Gly Ser Gly Ser Trp Ser

2420 2425 2430

Thr Cys Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser

2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu

2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr

2465 2470 2475

His Asn Lys Val Tyr Cys Thr Thr Ser Lys Ser Ala Ser Leu Arg

2480 2485 2490

Ala Lys Lys Val Thr Phe Asp Arg Met Gln Ala Leu Asp Ala His

2495 2500 2505

Tyr Asp Ser Val Leu Lys Asp Ile Lys Leu Ala Ala Ser Lys Val

2510 2515 2520

Thr Ala Arg Leu Leu Thr Leu Glu Glu Ala Cys Gln Leu Thr Pro

2525 2530 2535

Pro His Ser Ala Arg Ser Lys Tyr Gly Phe Gly Ala Lys Glu Val

2540 2545 2550
Arg Ser Leu Ser Gly Arg Ala Val Asn His Ile Lys Ser Val Trp

2555 2560 2565

Lys Asp Leu Leu Glu Asp Thr Gln Thr Pro Ile Pro Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr Lys Gly Gly

2585 2590 2595

Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro

2615 2620 2625

Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp

2645 2650 2655

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val

2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys

2675 2680 2685

Ser Leu Pro Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu

2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr

2705 2710 2715

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser

2720 2725 2730

Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys

2735 2740 2745

Lys Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp

2750 2755 2760
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Asp Leu Val Val He Ser Glu Ser Gin Gly Thr Glu Glu Asp Glu 

2765 2770 2775 

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 

2780 2785 2790 

Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr Asp Leu Glu Leu He 

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Gly Pro Gin Gly 

2810 2815 2820 

Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro He Ala 

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn He He Gin Tyr Ala Pro Thr He Trp Ala Arg Met 

2855 2860 2865 

Val Leu Met Thr His Phe Phe Ser He Leu Hec Ala Gin Asp Thr 

2870 2875 2880 

Leu Asp Gin Asn Leu Asn Phe Glu Het Tyr Gly Ala Val Tyr Ser 

2885 2890 2895 

Val Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly

2900 2905 2910 

Leu Asp Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr 

2915 2920 2925 

Arg Val Ala Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser

2945 2950 2955 

Arg Gly Gly Arg Ala Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp 

2960 2965 2970 
Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Arg

2975 2980 2985 

Leu Leu Asp Leu Ser Ser Trp Phe Thr Val Gly Ala Gly Gly Gly 

2990 2995 3000 

Asp Ile Tyr His Ser Val Ser Arg Ala Arg Pro Arg Leu Leu Leu

3005 3010 3015 

Leu Gly Leu Leu Leu Leu Phe Val Gly Val Gly Leu Phe Leu Leu 

3020 3025 3030 

Pro Ala Arg 
3033 






Sequence ID No. 6

Sequence Length: 9,511

Sequence Type: nucleic acid 

Strandedness: single 

Topology: linear 

Molecule Type: genomic RNA 

Method for Determination of Feature: E 



GCCCGCCCCC UGAUGGGGGC GACACUCCGC CAUGAAUCAC UCCCCUGUGA GGAACUACUG 60
UCUUCACGCA GAAAGCGUCU AGCCAUGGCG UUAGUAUGAG UGUCGUACAG CCUCCAGGCC 120
CCCCCCUCCC GGGAGAGCCA UAGUGGUCUG CGGAACCGGU GAGUACACCG GAAUUACCGG 180
AAAGACUGGG UCCUUUCUUG GAUAAACCCA CUCUAUGUCC GGUCAUUUGG GCACGCCCCC 240
GCAAGACUGC UAGCCGAGUA GCGUUGGGUU GCGAAAGGCC UUGUGGUACU GCCUGAUAGG 300
GURCUUGCGA GUGCCCCGGG AGGUCUCGUA GACCGUGCAU CAUGAGCACA AAUCCUAAAC 360
CUCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGUUAAG UUCCCGGGUG 420
GCGGUCAGAU CGUUGGCGGA GUUUACUUGC UGCCGCGCAG GGGCCCCAGG UUGGGUGUGC 480
GCGCGACAAG GAAGACUUCY GAGCGAUCCC AGCCGCGUGG ACGACGCCAG CCCAUCCCGA 540
AAGAUCGGCG CUCCACCGGC AAGUCCUGGG GAAAGCCAGG AUAUCCUUGG CCCCUGUACG 600
GAAACGAGGG UUGCGGCUGG GCGGGUUGGC UCCUGUCCCC CCGCGGGUCU CGUCCUACUU 660
GGGGCCCCAC CGACCCCCGG CAUAGAUCAC GCAAUUUGGG CAGAGUCAUC GAUACCAUUA 720
CGUGUGGUUU UGCCGACCUC AUGGGGUACA UCCCUGUCGU UGGCGCCCCG GUYGGAGGCG 780
UCGCCAGAGC UCUGGCACAC GGUGUUAGGG UCCUGGAGGA CGGGAUAAAU UACGCAACAG 840
GGAAUUUACC CGGUUGCUCU UUUUCUAUCU UUUUGCUUGC UCUUCUGUCA UGCGUCACAR 900
UGCCAGUGUC UGCAGUGGAA GUCAGGAACA UYAGUUCUAG CUACUACGCC ACUAAUGAUU 960
GCUCAAACAA CAGCAUCACC UGGCAGCUCA CUGACGCAGU UCUCCAUCUU CCUGGAUGCG 1020
UCCCAUGUGA GAAYGAUAAY GGCACCUUGC RUUGCUGGAU ACAAGUAACA CCCRACGUGG 1080
CUGUGAAACA CCGCGGUGCG CUCACUCGUA GCCUGCGAAC ACACGUCGAC AUGAUCGUAA 1140
UGGCAGCUAC GGCCUGCUCG GCCUUGUAUG UGGGAGAUGU GUGCGGGGCC GUGAUGAUYC 1200
UAUCGCAGGC UUUCAUGGUA UCACCACAAC GCCACAACUU CACCCAAGAG UGCAACUGUU 1260
CCAUCUACCA AGGUCACAUC ACCGGCCAUC GCAUGGCAUG GGACAUGAUG CURARCUGGU 1320
CUCCAACUCU URCCAUGAUC CUCGCCUACG CYGCUCGYGU UCCCGARCUG GUCCUCGAAA 1380
UYAUYUUCGG CGGCCAUUGG GGUGUGGYGU UYGGCUUGGS CUAUUUCUCC AUGCARGGAG 1440
CGUGGGCCAA AGUCRUYGCC AUCCUCCUUC UUGUUGCGGG AGUGGAUGCA WCCACCUAUU 1500
CCASCGGYCA GSAAGCGGGU CGURCCGYCK HKGGGWUCKC URGCCUCUUU AHUACUGGUG 1560
CCAAGCAGAA CCUCYAUUUR AUCAACACCA AUGGCAGCUG GCACAUAAAC CGGACUGCCC 1620
UCAAUUGCAA UGACAGCYUA SAGACGGGUU UCHUCGCUUC CYUGKUUUAC WHCCRCARGU 1680
UCAACAGCUC UGGCUGCCCC GAGCGCUUGU CUUCCUGCCG CGGGCUGGAC GAYUUYCGCA 1740
UCGGCUGGGG AACCUUGGAA UACGAAACCA ACGUCACCAA CGAUGRGGAC AUGAGGCCGU 1800
ACUGCUGGCA UUACCCCCCG AGGCCUUGCG GCAUCGUCCC GGCUAGGACG GUUUGCGGAC 1860
CGGUCUAUUG YUUCACCCCU AGCCCUGUUG UCGUGGGCAC CACUGACAAG CAGGGCGUAC 1920
CCACCUACAC CUGGGGRGAA AACGAGACCG AUGUCUUCCU GCRAAAUAGC ACAAGACCCC 1980
CGCGAGGAGC UUGGUUCGGC UGCACYUGGA UGAACGGGAC UGGGUUCACU AAGACAUGCG 2040
GUGCACCACC UUGCCGCAUU AGGAAAGACU ACAACAGCAC UCUCGAUUUA UUGUGCCCCA 2100
CAGACUGUUU UAGGAAGCAC CCAGAUGCUA CCUAUCUUAA GUGUGGAGCA GGGCCUUGGU 2160
UAACUCCCAG GUGCCUGGUA GACUACCCUU AUAGRYUGUG GCAUUAUCCG UGCACUGUAA 2220
ACUUCACCAU CUUYAAGGCG CGGAUGUAUG UAGGAGGGGU GGAGCAUCGA UUCUCCGCAG 2280
CAUGCAACUU CACGCGCGGA GAUCGCUGCA GACUGGAAGA UAGGGAUAGG GGYCAGCAGA 2340
GUCCACUGCU GCAUUCCACU ACUGAGUGGG CGGUGYUCCC AUGCUCCUUC UCUGACCUAC 2400
CAGCACUAUC CACUGGCCUA UUGCACCUCC ACCAAAACAU CGUGGACGUG CAGUACCUYU 2460
ACGGACUUUC UCCGGCUCUG ACAAGAUACA UCGUGAAGUG GGAGUGGGUG AUCCUCCUUU 2520
UCUUGUUGUU GGCAGACGCC AGGRUCUGUG CAUGCCUUUG GAUGCUCAWC AUACUGGGCC 2580
AAGCCGAAGC GGCGCUUGAG AAGCUCAUCA UCUUGCACUC CGCUAGYGCU GCUAGUGCCA 2640
AUGGUCCGCU GUGGUUUUUC AUCUUCUUUA CAGCGGCCUG GUACUUAAAG GGCAGGGUGG 2700
UCCCCGUGGC CACGUACUCU GUGCUCGGCU URUGGUCCUU CCUCCUCCUA GUCCUGGCYU 2760
UACCACAGCA GGCUUAUGCC UUGGACGCUG CUGAACAAGG GGAACUGGGG CUGGCCAUAU 2820
UAGUAAUUAU AUCCAUCUUU ACUCUUACCC CAGCAUACAA GAUCCUCCUG AGCCGUUCAG 2880
UGUGGUGGCU GUCCUACAUG CUGGUCUUGG CCGAGGCCCA GAUUCAGCAA UGGGUUCCCC 2940
CCCUGGAGGU CCGAGGGGGG CGUGACGGGA UCAUCUGGGU GGCUGUCAUU CUACACCCAC 3000
GCCUUGUGUU UGAGGUCACG AAAUGGUUGU UAGCAAUCCU GGGGCCUGCC UACCUCCUUA 3060
RAGCGUCUCU GCUACGGAUA CCGUACUUUG UGAGGGCCCA CGCUUUGCUA CGAGUGUGUA 3120
CCCUGGUGAA ACACCUCGCR GGGGCUAGGU ACAUCCAGAU GCUGUURAUC ACCAUAGGCA 3180
GAUGGACCGG CACUUACAUC UACGACCACC UCUCCCCUUU AUCAACUUGG GCGGCCCAGG 3240
GUUURCGGGA CCUGGCAAUC GCCGUGGAGC CUGUGGUGUU CAGCCCAAUG GAGAAGAAGG 3300
UCAUUGUGUG GGGGGCUGAG ACAGUGGCGU GUGGAGACAU CCUGCAUGGC CUCCCGGUCU 3360
CCGCGAGGCU AGGUAGGGAR GUUCUGCUCG GCCCUGCCGA CGGCUACACC UCCAAGGGGU 3420
GGAAKCUCCU AGCUCCCAUU ACUGCUUACA CUCAGCAAAC UCGUGGUCUC CUGGGUGCUA 3480
UCGUGGUCAG CCUAACGGGC CGCGACAAAA AUGAGCAGGC UGGGCAGGUC CAGGUUCUGU 3540
CCUCCGUCAC ACAAACUUUC UUGGGGACAU CCAUUUCGGG CGUCCUCUGG ACAGUAUAUC 3600
ACGGGGCUGG UAAUAAGACC UUGGCCGGCC CCAAGGGACC AGUCACUCAG AUGUACACCA 3660
GCGCAGAAGG GGACCUCGUG GGAUGGCCUA GUCCCCCCGG GACUAAGUCA UUGGACCCCU 3720
GUACCUGCGG GGCCGUAGAC CUCUACCUGG UCACCCGAAA CGCUGAUGUC AUUCCGGUCC 3780
GGAGGAAAGA UGACCGACGG GGUGCAUUAC UCUCGCCAAG GCCCCUCUCA ACCCUCAAAG 3840
GAUCAUCCGG AGGGCCCGUG CUCUGCUCWA GGGGACACGC CGUGGGCUUG UUCAGAGCGG 3900
CCGUGUGUGC CAGGGGUGUA GCCAAAUCUA UUGACUUCAU CCCCGUCGAA UCACUCGAUR 3960
UCGCCACACG GACGCCCAGU UUCUCUGACA ACAGURCGCC GCCAGCUGUG CCCCAGUCUU 4020
ACCAGGUGGG UUACUUGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GUCCCUGCCG 4080
CGUAUGCCAG UCAGGGGUAU AAAGUACUCG UACUAAAUCC CUCUGUCGCG GCCACACUUG 4140
GUUUUGGGGC CUACAUGUCC AAAGCCCACG GGAUCAACCC UAAUAUCAGA ACUGGAGUGC 4200
GGACCGUUAC CACCGGGGAC UCUAUCACUU ACUCCACUUA UGGCAAGUUU AUCGCAGAUG 4260
GAGGCUGUGC AGCCGGUGCC UAUGACAUCA UCAUAUGCGA CGAAUGCCAU UCAGUGGACG 4320
CUACUACCAU CCUUGGCAUU GGAACAGUCC UUGACCAAGC UGAGACCGCA GGCGUCAGGC 4380
UAGUGGUYUU GGCCACAGCC ACGCCUCCCG GUACGGUGAC AACUCCCCAC AGUAACAUAG 4440
AGGAGGUGGC CCUUGGUCAC GAGGGCGAGA UCCCUUUUUA UGGCAAAGCU AUUCCCCUAG 4500
CUUUCAUCAA GGGGGGCAGA CACUUGAUCU UUUGCCAUUC AAAGAAGAAG UGCGACGAGC 4560
UCGCAGCGGC CCUCCGGGGC AYGGGUGUCA AUGCCGUUGC AUACUAUAGG GGUCUCGACG 4620
UCUCCGUUAU ACCAACUCAA GGAGACGUGG UGGUUGUCGC CACUGAUGCC CUAAUGACUG 4680
GGUACACCGG CGACUUUGAC UCYGUCAUCG ACUGUAAUGU UGCAGUCUCU CAGAUUGUUG 4740
ACUUCAGCCU AGACCCAACC UUCACCAUCA CCACUCAAAC CGUCCCUCAG GACGCUGUCU 4800
CCCGUAGUCA ACGUAGAGGG AGAACUGGGA GGGGGCGAUU GGGCRUUUAC AGGUAUGUUU 4860
CGUCAGGYGA RRGGCCGUCU GGGAUGUUCG ACAGCGUAGU GCYCUGCGAG UGCUAUGAUG 4920
CCGGGGCAGC CUGGUACGAG CUUACACCUG CUGAGACUAC GGUGAGACUC CGGGCYUAUU 4980
UCAACACGCC CGGUUUGCCC GUAUGUCAAG ACCACCUGGA GUUCUGGGAA GCGGUCUUUA 5040
CAGGUCUCAC WCACAUURAC GCCCACUUCC UCUCCCAGAC GAAGCAAGGA GGAGAAAACU 5100
UUGCRUAUCU AACGGCCUAC CAGGCCACAG UAUGCGCCAG GGCAAAGGCC CCUCCUCCUU 5160
CGUGGGACGU GAUGUGGAAG UGUCUAACUA GGCUCAAACC UACACUGACU GGUCCCACCC 5220
CCCUCCUGUA CCGCUUGGGU GCCGUGACCA AUGAGGUYAC CUUGACGCAC CCCGUGACGA 5280
AAUACAUCGC CACGUGCAUG CAAGCUGACC UYGAGAUCAU GACAAGCUCA UGGGUCCUGG 5340
CGGGGGGGGU GCUAGCCGCC GUGGCAGCUU ACUGCCUGGC GACUGGCUGC AUUUCCAUCA 5400
UUGGCCGCCU ACACCUGAAU GAUCGGGUGG UUGUGRCCCC YGACAAGGAR AUCUUAUAUG 5460
AGGCCUUUGA UGAGAUGGAA GAAUGCGCCU CCAAAGCCGC CCUCAUUGAG GAAGGGCAGC 5520
GGAUGGCGGA GAUGCUCAAA UCUAAGAUAC AAGGCCUCCU ACAACAGGCC ACAAGGCAAG 5580
CUCAAGRCAU RCAGCCAGCU AUACAGUCAU CAUGGCCCAA GCUUGAACAA UUUUGGGCCA 5640
AACACAUGUG GAACUUCAUC AGUGGUAUAC AGUACCUAGC AGGACUCUCC ACCCUACCGG 5700
GAAAUCCUGC AGURGCAUCA AUGAUGGCUU UUAGCGCCGC GCUGACUAGC CCACUACCCA 5760
CCAGCACCAC CAUCCUCUUG AACAUCAUGG GAGGAUGCUU GGCCUCYCAG AUUGCCCCCC 5820
CUGCCGGAGC CACYGGCUUC GUUGUCAGUG GUCUAGUGGG GGCGGCCGUC GGAAGCAUAG 5880
GCCUGGGUAA GAUACUGGUG GACGUUUUGG CCGGGUACGG CGCAGGCAUU UCAGGGGCCC 5940
UCGUAGCUUU UAAGAUCAUG AGCGGCGAGA AGCCCACGGU AGAAGACGUU GUGAAUCUCC 6000
UGCCUGCUAU YCUGUCUCCU GGUGCGYUGG UAGUGGGAGU CAUCUGUGCA GCAAUYCUGC 6060
GCCGCCACGU CGGUCAGGGA GAGGGRGCGG UCCAGUGGAU GAACAGACUG AUCGCCUUCG 6120
CCUCCAGGGG AAACCACGUU GCCCCUACCC ACUACGUGGU GGAGUCUGAC GCUUCACAGC 6180
GUGURACGCA GGUGCUGAGU UCACUUACAA UUACCAGCUU ACUUAGGAGA CUACAUGCCU 6240
GGAUCACUGA AGAUUGCCCA RUCCCAUGCU CGGGGUCUUG GCUCCAGGAC AUUUGGGAUU 6300
GGGUUUGUUC CAUCCUCACA GACUUYAAAA ACUGGCUGUC UUCAAAAUUA CUCCCCAAGA 6360
UGCCCGGCAU UCCCUUUAUC UCUUGCCAGA AGGGAUACAA GGGUGUAUGG GCUGGUACGG 6420
GUGUCAUGAC YACUCGRURC CCAUGUGGAG CAAACAUCUC GGGCCAUGUC CGCAUGGGCA 6480
CCAUGAAAAU AACAGGCCCG AAGACUUGCU UGAACCUGUG GCAGGGGACU UUCCCCAUUA 6540
AUUGUUACAC AGAAGGGCCY UGCGUGCCAA AACCCCCUCC UAAUUACAAG ACCGCAAUUU 6600
GGAGGGUGGC AGCGUCGGAG UACGUUGAGG UCACACAGCA UGGCUCUUUC UCGUAUGUAA 6660
CRGGGUUAAC CAGUGACAAC CUUAAGGUYC CUUGCCAGGU ACCAGCUCCA GAAUUUUUCU 6720
CUUGGGUGGA CGGGGUGCAA AUCCACCGAU UCGCCCCCGU WCCAGGUCCC UUCUUUCGGG 6780
AUGAGGUAAC GUUCACCGUA GGCCUUAACU CCUUCGUGGU CGGCUCUCAG CUCCCUUGCG 6840
AUCCUGAGCC GGACACCGAR GUACUGGCCU CYAUGUUGAC AGACCCGUCC CACAUCACCG 6900
CKGAGGCGGC AGCCAGGCGA UUGGCAAGGG GAUCUCCCCC YUCACAGGCU AGCUCCUCAG 6960
CGAGCCAGCU CUCUGCCCCG UCCUUGAAGG CUACCUGUAC CACCCAUAAG ACAGCAUAUG 7020
AUUGUGACAU GGUGGAUGCY AACCUUUUCA UGGGAGGHGA UGUGAYCCGG AUUGAGUCUG 7080
ACUCUAAGGU GAUCGUUCUA GACUCCCUCG AUUCCAUGAC UGAGGUAGAG GAUGAUCGUG 7140
AGCCUUCUGU ACCAUCAGAG UACCUGAUCA AGAGGAGAAA GUUCCCACCG GCGCUGCCUC 7200
CUUGGGCCCG UCCAGACUAC AAUCCUGUUU UGAUCGAGAC AUGGAAGAGG CCGGGCUAUG 7260
AACCACCCAC UGUCCUAGGC UGUGCCCUCC CCCCCACACY UCAAACGCCA GUGCCUCCAC 7320
CUCGGAGGCG CCGCGCYAAA RUCCUGACCC AGGACRAUGU GGAGGGGRUC CUCAGGGAGA 7380
UGGCUGACAA AGURCUCAGC CCUCUCCAAG ACAACAAUGA CUCCGGUCAC UCCACUGGAG 7440
CGGAUACCGG AGGAGACAUC GUCCAGCAAC CCUCUGACGA GACUGCCGCU UCAGAAGCGG 7500
GGUCACUGUC CUCCAUGCCU CCCCUUGAGG GAGAGCCGGG AGACCCYGAC CUGGAGUUUG 7560
AACCAGUGGG AUCCGCUCCC CCUUCUGAGG GGGAGUGUGA GGUCAUUGAU UCGGACUCUA 7620
AGUCGUGGUC CACAGUCUCU GAUCAAGAGG AUUCUGUUAU CUGCUGCUCU AUGUCAUACU 7680
CCUGGACGGG GGCCCUCAUA ACACCAUGUG GGCCCGAAGA GGAGAAGUUA CCGAUCAACC 7740
CUCUGAGUAA UUCGCUCAUG CGGUUCCAUA AYAAGGUGUA CUCCACAACC UCGAGGAGUG 7800
CCUCUCUGAG GGCAAAGAAG GUGACUUUUG ACAGGGUGCA GGUGCUGGAC GCACACUAUG 7860
ACUCAGUCUU GCAGGACGUU AAGCGGGCCG CCUCUAAGGU URGUGCGAGG CUCCUCACAG 7920
UAGAGGAAGC CUGCGCGCUG ACCCCGCCCC ACUCCGCCAA AUCGCGAUAC GGAUUUGGGG 7980
CAAAAGAGGU GCGCAGCUUA UCCAGGAGGG CCGUUAACCA CAUCCGGUCC GUGUGGGAGG 8040
ACCUCCUGGA AGACCAACRU ACCCCAAUUG ACACAACUAU CAUGGCUAAA AAUGAGGUGU 8100
UCUGCAUUGA UCCAACUAAR GGUGGGAAAA AGCCAGCUCG CCUCAUCGUA UACCCCGACC 8160
UUGGGGUCAG GGUGUGCGAA AAGAUGGCCC UCUAUGACAU CRCACAAAAG CUUCCCAAAG 8220
CGAUAAUGGG GCCAUCCUAU GGGUUCCAAU ACUCUCCCGC AGAACGGGUC GAUUUCCUCC 8280
UCAAAGCUUG GGGAAGUAAG AAGGACCCAA UGGGGUUCUC GUAUGACACC CGCUGCUUUG 8340
ACUCAACCGU CACGGAGAGG GACAUAAGAA CAGAAGAAUC CAUAUAUCAG GCUUGUUCUC 8400
UGCCUCAAGA AGCCAGAACU GUCAUACACU CGCUCACUGA GAGACUUUAC GUAGGAGGGC 8460
CCAUGACAAA CAGCAAAGGG CAAUCCUGCG GCUACAGGCG UUGCCGCGCA AGCGGKGUUU 8520
UCACCACCAG CAUGGGGAAU ACCAUGACAU GUUACAUCAA AGCCCUUGCA GCGUGUAAGG 8580
CUGCRGGGAU CGUGGACCCU GUUAUGUUGG UGUGUGGAGA CGACCUGGUC GUCAUCUCAG 8640
AGAGCCAAGG UAACGAGGAG GACGAGCGAA ACCUGAGAGC UUUCACGGAG GCUAUGACCA 8700
GGUAUUCCGC CCCUCCCGGU GACCUUCCCA GACCGGAAUA UGACUUGGAG CUUAUAACAU 8760
CCUGCUCCUC AAACGUAUCG GUAGCGCUGG ACUCUCGGGG UCGCCGCCGG UACUUCCUAA 8820
CCAGAGACCC UACCACUCCA AUCACCCGAG CUGCUUGGGA AACAGUAAGA CACUCCCCUG 8880
UCAAUUCUUG GCUGGGCAAC AUCAUCCAGU ACGCCCCCAC AAUCUGGGUC CGGAUGGUCA 8940
UAAUGACUCA CUUCUUCUCC AUACUAUUGG CCCAGGACAC UCUGAACCAA AAUCUCAAUU 9000
UUGAGAUGUA CGGGGCAGUA UACUCGGUCA AUCCAUUAGA CCUACCGGCC AUAAUUGAAA 9060
GGCUACAUGG GCUUGAAGCC UUUUCACUGC ACACAUACUC UCCCCACGAA CUCUCACGGG 9120
UGGCAGCAAC UCUCAGAAAA CUUGGAGCGC CUCCCCUUAG AGCGUGGAAG AGUCGGGCGC 9180
GUGCCGUGAG AGCUUCACUC AUCGCCCAAG GAGCGAGGGC GGCCAUUUGU GGCCGCUACC 9240
UCUUCAACUG GGCGGUGAAA ACAAAGCUCA AACUCACUCC AUUGCCCGAG GCGAGCCGCC 9300
UGGAUUUAUC CGGGUGGUUC ACCGUGGGCG CCGGCGGGGG CGACAUUUAU CACAGCGUGU 9360
CGCAUGCYCG ACCCCGCCUA UUACUCCUUU GCCUACUCCU ACUUAGCGUA GGAGUAGGCA 9420
UCUUUUUACU CCCCGCUCGG UAGAGCGGCA AACYCUAGCU ACACUCCAUA GCUAGUUUCC 9480
GUUUUUUUUU UUUUUUUUUU UUUUUUUUUU U 9511
Sequence ID No. 7
Sequence Length: 9,511
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GCCCGCCCCC TGATGGGGGC GACACTCCGC CATGAATCAC TCCCCTGTGA GGAACTACTG 60
TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTACAG CCTCCAGGCC 120
CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTACCGG 180
AAAGACTGGG TCCTTTCTTG GATAAACCCA CTCTATGTCC GGTCATTTGG GCACGCCCCC 240
GCAAGACTGC TAGCCGAGTA GCGTTGGGTT GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300
GTRCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAT CATGAGCACA AATCCTAAAC 360
CTCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGTTAAG TTCCCGGGTG 420
GCGGTCAGAT CGTTGGCGGA GTTTACTTGC TGCCGCGCAG GGGCCCCAGG TTGGGTGTGC 480
GCGCGACAAG GAAGACTTCY GAGCGATCCC AGCCGCGTGG ACGACGCCAG CCCATCCCGA 540
AAGATCGGCG CTCCACCGGC AAGTCCTGGG GAAAGCCAGG ATATCCTTGG CCCCTGTACG 600
GAAACGAGGG TTGCGGCTGG GCGGGTTGGC TCCTGTCCCC CCGCGGGTCT CGTCCTACTT 660
GGGGCCCCAC CGACCCCCGG CATAGATCAC GCAATTTGGG CAGAGTCATC GATACCATTA 720
CGTGTGGTTT TGCCGACCTC ATGGGGTACA TCCCTGTCGT TGGCGCCCCG GTYGGAGGCG 780
TCGCCAGAGC TCTGGCACAC GGTGTTAGGG TCCTGGAGGA CGGGATAAAT TACGCAACAG 840
GGAATTTACC CGGTTGCTCT TTTTCTATCT TTTTGCTTGC TCTTCTGTCA TGCGTCACAR 900
TGCCAGTGTC TGCAGTGGAA GTCAGGAACA TYAGTTCTAG CTACTACGCC ACTAATGATT 960
GCTCAAACAA CAGCATCACC TGGCAGCTCA CTGACGCAGT TCTCCATCTT CCTGGATGCG 1020
TCCCATGTGA GAAYGATAAY GGCACCTTGC RTTGCTGGAT ACAAGTAACA CCCRACGTGG 1080
CTGTGAAACA CCGCGGTGCG CTCACTCGTA GCCTGCGAAC ACACGTCGAC ATGATCGTAA 1140
TGGCAGCTAC GGCCTGCTCG GCCTTGTATG TGGGAGATGT GTGCGGGGCC GTGATGATYC 1200
TATCGCAGGC TTTCATGGTA TCACCACAAC GCCACAACTT CACCCAAGAG TGCAACTGTT 1260
CCATCTACCA AGGTCACATC ACCGGCCATC GCATGGCATG GGACATGATG CTRARCTGGT 1320
CTCCAACTCT TRCCATGATC CTCGCCTACG CYGCTCGYGT TCCCGARCTG GTCCTCGAAA 1380
TYATYTTCGG CGGCCATTGG GGTGTGGYGT TYGGCTTGGS CTATTTCTCC ATGCARGGAG 1440
CGTGGGCCAA AGTCRTYGCC ATCCTCCTTC TTGTTGCGGG AGTGGATGCA WCCACCTATT 1500
CCASCGGYCA GSAAGCGGGT CGTRCCGYCK HKGGGWTCKC TRGCCTCTTT AHTACTGGTG 1560
CCAAGCAGAA CCTCYATTTR ATCAACACCA ATGGCAGCTG GCACATAAAC CGGACTGCCC 1620
TCAATTGCAA TGACAGCYTA SAGACGGGTT TCHTCGCTTC CYTGKTTTAC WHCCRCARGT 1680
TCAACAGCTC TGGCTGCCCC GAGCGCTTGT CTTCCTGCCG CGGGCTGGAC GAYTTYCGCA 1740
TCGGCTGGGG AACCTTGGAA TACGAAACCA ACGTCACCAA CGATGRGGAC ATGAGGCCGT 1800
ACTGCTGGCA TTACCCCCCG AGGCCTTGCG GCATCGTCCC GGCTAGGACG GTTTGCGGAC 1860
CGGTCTATTG YTTCACCCCT AGCCCTGTTG TCGTGGGCAC CACTGACAAG CAGGGCGTAC 1920
CCACCTACAC CTGGGGRGAA AACGAGACCG ATGTCTTCCT GCRAAATAGC ACAAGACCCC 1980
CGCGAGGAGC TTGGTTCGGC TGCACYTGGA TGAACGGGAC TGGGTTCACT AAGACATGCG 2040
GTGCACCACC TTGCCGCATT AGGAAAGACT ACAACAGCAC TCTCGATTTA TTGTGCCCCA 2100
CAGACTGTTT TAGGAAGCAC CCAGATGCTA CCTATCTTAA GTGTGGAGCA GGGCCTTGGT 2160
TAACTCCCAG GTGCCTGGTA GACTACCCTT ATAGRYTGTG GCATTATCCG TGCACTGTAA 2220
ACTTCACCAT CTTYAAGGCG CGGATGTATG TAGGAGGGGT GGAGCATCGA TTCTCCGCAG 2280
CATGCAACTT CACGCGCGGA GATCGCTGCA GACTGGAAGA TAGGGATAGG GGYCAGCAGA 2340
GTCCACTGCT GCATTCCACT ACTGAGTGGG CGGTGYTCCC ATGCTCCTTC TCTGACCTAC 2400
CAGCACTATC CACTGGCCTA TTGCACCTCC ACCAAAACAT CGTGGACGTG CAGTACCTYT 2460

ACGGACTTTC TCCGGCTCTG ACAAGATACA TCGTGAAGTG GGAGTGGGTG ATCCTCCTTT 2520
TCTTGTTGTT GGCAGACGCC AGGRTCTGTG CATGCCTTTG GATGCTCAWC ATACTGGGCC 2580
AAGCCGAAGC GGCGCTTGAG AAGCTCATCA TCTTGCACTC CGCTAGYGCT GCTAGTGCCA 2640
ATGGTCCGCT GTGGTTTTTC ATCTTCTTTA CAGCGGCCTG GTACTTAAAG GGCAGGGTGG 2700
TCCCCGTGGC CACGTACTCT GTGCTCGGCT TRTGGTCCTT CCTCCTCCTA GTCCTGGCYT 2760
TACCACAGCA GGCTTATGCC TTGGACGCTG CTGAACAAGG GGAACTGGGG CTGGCCATAT 2820
TAGTAATTAT ATCCATCTTT ACTCTTACCC CAGCATACAA GATCCTCCTG AGCCGTTCAG 2880
TGTGGTGGCT GTCCTACATG CTGGTCTTGG CCGAGGCCCA GATTCAGCAA TGGGTTCCCC 2940
CCCTGGAGGT CCGAGGGGGG CGTGACGGGA TCATCTGGGT GGCTGTCATT CTACACCCAC 3000
GCCTTGTGTT TGAGGTCACG AAATGGTTGT TAGCAATCCT GGGGCCTGCC TACCTCCTTA 3060
RAGCGTCTCT GCTACGGATA CCGTACTTTG TGAGGGCCCA CGCTTTGCTA CGAGTGTGTA 3120
CCCTGGTGAA ACACCTCGCR GGGGCTAGGT ACATCCAGAT GCTGTTRATC ACCATAGGCA 3180
GATGGACCGG CACTTACATC TACGACCACC TCTCCCCTTT ATCAACTTGG GCGGCCCAGG 3240
GTTTRCGGGA CCTGGCAATC GCCGTGGAGC CTGTGGTGTT CAGCCCAATG GAGAAGAAGG 3300
TCATTGTGTG GGGGGCTGAG ACAGTGGCGT GTGGAGACAT CCTGCATGGC CTCCCGGTCT 3360
CCGCGAGGCT AGGTAGGGAR GTTCTGCTCG GCCCTGCCGA CGGCTACACC TCCAAGGGGT 3420
GGAAKCTCCT AGCTCCCATT ACTGCTTACA CTCAGCAAAC TCGTGGTCTC CTGGGTGCTA 3480
TCGTGGTCAG CCTAACGGGC CGCGACAAAA ATGAGCAGGC TGGGCAGGTC CAGGTTCTGT 3540
CCTCCGTCAC ACAAACTTTC TTGGGGACAT CCATTTCGGG CGTCCTCTGG ACAGTATATC 3600
ACGGGGCTGG TAATAAGACC TTGGCCGGCC CCAAGGGACC AGTCACTCAG ATGTACACCA 3660
GCGCAGAAGG GGACCTCGTG GGATGGCCTA GTCCCCCCGG GACTAAGTCA TTGGACCCCT 3720
GTACCTGCGG GGCCGTAGAC CTCTACCTGG TCACCCGAAA CGCTGATGTC ATTCCGGTCC 3780
GGAGGAAAGA TGACCGACGG GGTGCATTAC TCTCGCCAAG GCCCCTCTCA ACCCTCAAAG 3840
GATCATCCGG AGGGCCCGTG CTCTGCTCWA GGGGACACGC CGTGGGCTTG TTCAGAGCGG 3900
CCGTGTGTGC CAGGGGTGTA GCCAAATCTA TTGACTTCAT CCCCGTCGAA TCACTCGATR 3960
TCGCCACACG GACGCCCAGT TTCTCTGACA ACAGTRCGCC GCCAGCTGTG CCCCAGTCTT 4020
ACCAGGTGGG TTACTTGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GTCCCTGCCG 4080
CGTATGCCAG TCAGGGGTAT AAAGTACTCG TACTAAATCC CTCTGTCGCG GCCACACTTG 4140
GTTTTGGGGC CTACATGTCC AAAGCCCACG GGATCAACCC TAATATCAGA ACTGGAGTGC 4200
GGACCGTTAC CACCGGGGAC TCTATCACTT ACTCCACTTA TGGCAAGTTT ATCGCAGATG 4260
GAGGCTGTGC AGCCGGTGCC TATGACATCA TCATATGCGA CGAATGCCAT TCAGTGGACG 4320
CTACTACCAT CCTTGGCATT GGAACAGTCC TTGACCAAGC TGAGACCGCA GGCGTCAGGC 4380
TAGTGGTYTT GGCCACAGCC ACGCCTCCCG GTACGGTGAC AACTCCCCAC AGTAACATAG 4440
AGGAGGTGGC CCTTGGTCAC GAGGGCGAGA TCCCTTTTTA TGGCAAAGCT ATTCCCCTAG 4500
CTTTCATCAA GGGGGGCAGA CACTTGATCT TTTGCCATTC AAAGAAGAAG TGCGACGAGC 4560
TCGCAGCGGC CCTCCGGGGC AYGGGTGTCA ATGCCGTTGC ATACTATAGG GGTCTCGACG 4620
TCTCCGTTAT ACCAACTCAA GGAGACGTGG TGGTTGTCGC CACTGATGCC CTAATGACTG 4680
GGTACACCGG CGACTTTGAC TCYGTCATCG ACTGTAATGT TGCAGTCTCT CAGATTGTTG 4740
ACTTCAGCCT AGACCCAACC TTCACCATCA CCACTCAAAC CGTCCCTCAG GACGCTGTCT 4800
CCCGTAGTCA ACGTAGAGGG AGAACTGGGA GGGGGCGATT GGGCRTTTAC AGGTATGTTT 4860
CGTCAGGYGA RRGGCCGTCT GGGATGTTCG ACAGCGTAGT GCYCTGCGAG TGCTATGATG 4920
CCGGGGCAGC CTGGTACGAG CTTACACCTG CTGAGACTAC GGTGAGACTC CGGGCYTATT 4980
TCAACACGCC CGGTTTGCCC GTATGTCAAG ACCACCTGGA GTTCTGGGAA GCGGTCTTTA 5040
CAGGTCTCAC HCACATTRAC GCCCACTTCC TCTCCCAGAC GAAGCAAGGA GGAGAAAACT 5100
TTGCRTATCT AACGGCCTAC CAGGCCACAG TATGCGCCAG GGCAAAGGCC CCTCCTCCTT 5160
CGTGGGACGT GATGTGGAAG TGTCTAACTA GGCTCAAACC TACACTGACT GGTCCCACCC 5220
CCCTCCTGTA CCGCTTGGGT GCCGTGACCA ATGAGGTYAC CTTGACGCAC CCCGTGACGA 5280
AATACATCGC CACGTGCATG CAAGCTGACC TYGAGATCAT GACAAGCTCA TGGGTCCTGG 5340
CGGGGGGGGT GCTAGCCGCC GTGGCAGCTT ACTGCCTGGC GACTGGCTGC ATTTCCATCA 5400
TTGGCCGCCT ACACCTGAAT GATCGGGTGG TTGTGRCCCC YGACAAGGAR ATCTTATATG 5460
AGGCCTTTGA TGAGATGGAA GAATGCGCCT CCAAAGCCGC CCTCATTGAG GAAGGGCAGC 5520
GGATGGCGGA GATGCTCAAA TCTAAGATAC AAGGCCTCCT ACAACAGGCC ACAAGGCAAG 5580
CTCAAGRCAT RCAGCCAGCT ATACAGTCAT CATGGCCCAA GCTTGAACAA TTTTGGGCCA 5640
AACACATGTG GAACTTCATC AGTGGTATAC AGTACCTAGC AGGACTCTCC ACCCTACCGG 5700
GAAATCCTGC AGTRGCATCA ATGATGGCTT TTAGCGCCGC GCTGACTAGC CCACTACCCA 5760
CCAGCACCAC CATCCTCTTG AACATCATGG GAGGATGCTT GGCCTCYCAG ATTGCCCCCC 5820
CTGCCGGAGC CACYGGCTTC GTTGTCAGTG GTCTAGTGGG GGCGGCCGTC GGAAGCATAG 5880
GCCTGGGTAA GATACTGGTG GACGTTTTGG CCGGGTACGG CGCAGGCATT TCAGGGGCCC 5940
TCGTAGCTTT TAAGATCATG AGCGGCGAGA AGCCCACGGT AGAAGACGTT GTGAATCTCC 6000
TGCCTGCTAT YCTGTCTCCT GGTGCGYTGG TAGTGGGAGT CATCTGTGCA GCAATYCTGC 6060
GCCGCCACGT CGGTCAGGGA GAGGGRGCGG TCCAGTGGAT GAACAGACTG ATCGCCTTCG 6120
CCTCCAGGGG AAACCACGTT GCCCCTACCC ACTACGTGGT GGAGTCTGAC GCTTCACAGC 6180
GTGTRACGCA GGTGCTGAGT TCACTTACAA TTACCAGCTT ACTTAGGAGA CTACATGCCT 6240
GGATCACTGA AGATTGCCCA RTCCCATGCT CGGGGTCTTG GCTCCAGGAC ATTTGGGATT 6300
GGGTTTGTTC CATCCTCACA GACTTYAAAA ACTGGCTGTC TTCAAAATTA CTCCCCAAGA 6360
TGCCCGGCAT TCCCTTTATC TCTTGCCAGA AGGGATACAA GGGTGTATGG GCTGGTACGG 6420
GTGTCATGAC YACTCGRTRC CCATGTGGAG CAAACATCTC GGGCCATGTC CGCATGGGCA 6480
CCATGAAAAT AACAGGCCCG AAGACTTGCT TGAACCTGTG GCAGGGGACT TTCCCCATTA 6540
ATTGTTACAC AGAAGGGCCY TGCGTGCCAA AACCCCCTCC TAATTACAAG ACCGCAATTT 6600
GGAGGGTGGC AGCGTCGGAG TACGTTGAGG TCACACAGCA TGGCTCTTTC TCGTATGTAA 6660
CRGGGTTAAC CAGTGACAAC CTTAAGGTYC CTTGCCAGGT ACCAGCTCCA GAATTTTTCT 6720
CTTGGGTGGA CGGGGTGCAA ATCCACCGAT TCGCCCCCGT WCCAGGTCCC TTCTTTCGGG 6780
ATGAGGTAAC GTTCACCGTA GGCCTTAACT CCTTCGTGGT CGGCTCTCAG CTCCCTTGCG 6840
ATCCTGAGCC GGACACCGAR GTACTGGCCT CYATGTTGAC AGACCCGTCC CACATCACCG 6900
CKGAGGCGGC AGCCAGGCGA TTGGCAAGGG GATCTCCCCC YTCACAGGCT AGCTCCTCAG 6960
CGAGCCAGCT CTCTGCCCCG TCCTTGAAGG CTACCTGTAC CACCCATAAG ACAGCATATG 7020
ATTGTGACAT GGTGGATGCY AACCTTTTCA TGGGAGGHGA TGTGAYCCGG ATTGAGTCTG 7080
ACTCTAAGGT GATCGTTCTA GACTCCCTCG ATTCCATGAC TGAGGTAGAG GATGATCGTG 7140
AGCCTTCTGT ACCATCAGAG TACCTGATCA AGAGGAGAAA GTTCCCACCG GCGCTGCCTC 7200
CTTGGGCCCG TCCAGACTAC AATCCTGTTT TGATCGAGAC ATGGAAGAGG CCGGGCTATG 7260
AACCACCCAC TGTCCTAGGC TGTGCCCTCC CCCCCACACY TCAAACGCCA GTGCCTCCAC 7320
CTCGGAGGCG CCGCGCYAAA RTCCTGACCC AGGACRATGT GGAGGGGRTC CTCAGGGAGA 7380
TGGCTGACAA AGTRCTCAGC CCTCTCCAAG ACAACAATGA CTCCGGTCAC TCCACTGGAG 7440
CGGATACCGG AGGAGACATC GTCCAGCAAC CCTCTGACGA GACTGCCGCT TCAGAAGCGG 7500
GGTCACTGTC CTCCATGCCT CCCCTTGAGG GAGAGCCGGG AGACCCYGAC CTGGAGTTTG 7560
AACCAGTGGG ATCCGCTCCC CCTTCTGAGG GGGAGTGTGA GGTCATTGAT TCGGACTCTA 7620
AGTCGTGGTC CACAGTCTCT GATCAAGAGG ATTCTGTTAT CTGCTGCTCT ATGTCATACT 7680
CCTGGACGGG GGCCCTCATA ACACCATGTG GGCCCGAAGA GGAGAAGTTA CCGATCAACC 7740
CTCTGAGTAA TTCGCTCATG CGGTTCCATA AYAAGGTGTA CTCCACAACC TCGAGGAGTG 7800
CCTCTCTGAG GGCAAAGAAG GTGACTTTTG ACAGGGTGCA GGTGCTGGAC GCACACTATG 7860
ACTCAGTCTT GCAGGACGTT AAGCGGGCCG CCTCTAAGGT TRGTGCGAGG CTCCTCACAG 7920
TAGAGGAAGC CTGCGCGCTG ACCCCGCCCC ACTCCGCCAA ATCGCGATAC GGATTTGGGG 7980
CAAAAGAGGT GCGCAGCTTA TCCAGGAGGG CCGTTAACCA CATCCGGTCC GTGTGGGAGG 8040
ACCTCCTGGA AGACCAACRT ACCCCAATTG ACACAACTAT CATGGCTAAA AATGAGGTGT 8100
TCTGCATTGA TCCAACTAAR GGTGGGAAAA AGCCAGCTCG CCTCATCGTA TACCCCGACC 8160
TTGGGGTCAG GGTGTGCGAA AAGATGGCCC TCTATGACAT CRCACAAAAG CTTCCCAAAG 8220
CGATAATGGG GCCATCCTAT GGGTTCCAAT ACTCTCCCGC AGAACGGGTC GATTTCCTCC 8280
TCAAAGCTTG GGGAAGTAAG AAGGACCCAA TGGGGTTCTC GTATGACACC CGCTGCTTTG 8340
ACTCAACCGT CACGGAGAGG GACATAAGAA CAGAAGAATC CATATATCAG GCTTGTTCTC 8400
TGCCTCAAGA AGCCAGAACT GTCATACACT CGCTCACTGA GAGACTTTAC GTAGGAGGGC 8460
CCATGACAAA CAGCAAAGGG CAATCCTGCG GCTACAGGCG TTGCCGCGCA AGCGGKGTTT 8520
TCACCACCAG CATGGGGAAT ACCATGACAT GTTACATCAA AGCCCTTGCA GCGTGTAAGG 8580
CTGCRGGGAT CGTGGACCCT GTTATGTTGG TGTGTGGAGA CGACCTGGTC GTCATCTCAG 8640
AGAGCCAAGG TAACGAGGAG GACGAGCGAA ACCTGAGAGC TTTCACGGAG GCTATGACCA 8700
GGTATTCCGC CCCTCCCGGT GACCTTCCCA GACCGGAATA TGACTTGGAG CTTATAACAT 8760
CCTGCTCCTC AAACGTATCG GTAGCGCTGG ACTCTCGGGG TCGCCGCCGG TACTTCCTAA 8820
CCAGAGACCC TACCACTCCA ATCACCCGAG CTGCTTGGGA AACAGTAAGA CACTCCCCTG 8880
TCAATTCTTG GCTGGGCAAC ATCATCCAGT ACGCCCCCAC AATCTGGGTC CGGATGGTCA 8940
TAATGACTCA CTTCTTCTCC ATACTATTGG CCCAGGACAC TCTGAACCAA AATCTCAATT 9000
TTGAGATGTA CGGGGCAGTA TACTCGGTCA ATCCATTAGA CCTACCGGCC ATAATTGAAA 9060
GGCTACATGG GCTTGAAGCC TTTTCACTGC ACACATACTC TCCCCACGAA CTCTCACGGG 9120
TGGCAGCAAC TCTCAGAAAA CTTGGAGCGC CTCCCCTTAG AGCGTGGAAG AGTCGGGCGC 9180
GTGCCGTGAG AGCTTCACTC ATCGCCCAAG GAGCGAGGGC GGCCATTTGT GGCCGCTACC 9240
TCTTCAACTG GGCGGTGAAA ACAAAGCTCA AACTCACTCC ATTGCCCGAG GCGAGCCGCC 9300
TGGATTTATC CGGGTGGTTC ACCGTGGGCG CCGGCGGGGG CGACATTTAT CACAGCGTGT 9360
CGCATGCYCG ACCCCGCCTA TTACTCCTTT GCCTACTCCT ACTTAGCGTA GGAGTAGGCA 9420
TCTTTTTACT CCCCGCTCGG TAGAGCGGCA AACYCTAGCT ACACTCCATA GCTAGTTTCC 9480
GTTTTTTTTT TTTTTTTTTT TTTTTTTTTT T 9511
Sequence ID No. "B
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly
50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser
65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
80 85 90

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
95 100 105

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly
110 115 120

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly
125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala
140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala
155 160 165
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala
170 175 180

Leu Leu Ser Cys Val Thr Val Pro Val Ser Ala Val Glu Val Arg
185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn
200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly
215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu His Cys Trp Ile
230 235 240

Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala Leu Thr
245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr
260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met
275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe
290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly
305 310 315

His Arg Met Ala Trp Asp Met Met Leu Ser Trp Ser Pro Thr Leu
320 325 330

Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu
335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Val Phe Gly Leu Ala
350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Ile Ala Ile Leu
365 370 375

Leu Leu Val Ala Gly Val Asp Ala Thr Thr Tyr Ser Ser Gly Gln
380 385 390

Glu Ala Gly Arg Thr Val Ala Gly Phe Ala Gly Leu Phe Thr Thr
395 400 405

Gly Ala Lys Gln Asn Leu Tyr Leu Ile Asn Thr Asn Gly Ser Trp
410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr
425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Lys Phe Asn Ser Ser
440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe
455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn
470 475 480

Asp Gly Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro
485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys
500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly
515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu
530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr
545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro
560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys
575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys
590 595 600
Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr
605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile
620 625 630

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser
635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp
650 655 660

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu
665 670 675

Trp Ala Val Leu Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser
680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr
695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp
710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile
725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Ile Leu Gly Gln Ala Glu Ala
740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser
755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp
770 775 780

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu
785 790 795

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln
800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala
815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys
830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val
845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val
860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His
875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu
890 895 900

Gly Pro Ala Tyr Leu Leu Lys Ala Ser Leu Leu Arg Ile Pro Tyr
905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys
920 925 930

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile
935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu
950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val
965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp
980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro
995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp
1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Lys Leu Leu Ala Pro Ile Thr Ala
1025 1030 1035

Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser
1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val
1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly
1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala
1085 1090 1095

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly
1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp
1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn
1130 1135 1140

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala
1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly
1160 1165 1170

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg
1175 1180 1185

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1190 1195 1200

Pro Val Glu Ser Leu Asp Val Ala Thr Arg Thr Pro Ser Phe Ser
1205 1210 1215

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly
1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
1235 1240 1245

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala
1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr
1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala
1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp
1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu
1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn
1355 1360 1365

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr
1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu
1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala
1400 1405 1410

Leu Arg Gly Met Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu
1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val
1445 1450 1455

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu
1460 1465 1470
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Val Tyr Arg Tyr Val Ser Ser Gly Glu Arg Pro Ser Gly Met
1505 1510 1515

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala
1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala
1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro
1610 1615 1620

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val
1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1670 1675 1680

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg
1685 1690 1695

Val Val Val Ala Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp
1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly
1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu
1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Asp Ile Gln Pro Ala Ile Gln
1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp
1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala
1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile
1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala
1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser
1835 1840 1845

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly
1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly
1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile
1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905

Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met
1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro
1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln
1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His
1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Val Pro Cys Ser Gly Ser Trp
1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe
1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile
2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly
2015 2020 2025

Thr Gly Val Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser
2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr
2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr
2060 2065 2070

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala
2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His
2090 2095 2100

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys
2105 2110 2115

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe
2135 2140 2145

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val
2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu
2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala
2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser
2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
2210 2215 2220

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu
2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val
2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp
2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys
2270 2275 2280

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro
2285 2290 2295

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr
2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Pro Gln Thr Pro Val Pro
2315 2320 2325

Pro Pro Arg Arg Arg Arg Ala Lys Val Leu Thr Gln Asp Asn Val
2330 2335 2340

Glu Gly Val Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu
2345 2350 2355

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly
2360 2365 2370

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu
2375 2380 2385

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly
2390 2395 2400

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser
2405 2410 2415

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser
2420 2425 2430

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser
2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu
2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe
2465 2470 2475

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg
2480 2485 2490

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His
2495 2500 2505

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val
2510 2515 2520

Ser Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro
2525 2530 2535

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val
2540 2545 2550

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp
2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln His Thr Pro Ile Asp Thr Thr Ile
2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly
2585 2590 2595

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg
2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Ala Gln Lys Leu Pro
2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala
2630 2635 2640

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp
2645 2650 2655

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys
2675 2680 2685

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu
2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser
2705 2710 2715

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser
2720 2725 2730

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys
2735 2740 2745

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp
2750 2755 2760

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu
2765 2770 2775

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
2780 2785 2790

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile
2795 2800 2805

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly
2810 2815 2820

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr
2825 2830 2835

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp
2840 2845 2850

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met
2855 2860 2865

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr
2870 2875 2880

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser
2885 2890 2895

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly
2900 2905 2910

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser
2915 2920 2925

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg
2930 2935 2940

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala
2945 2950 2955

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp
2960 2965 2970

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser
2975 2980 2985

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly
2990 2995 3000

Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Leu Leu Leu
3005 3010 3015

Leu Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Phe Leu Leu
3020 3025 3030

Pro Ala Arg
3033
Sequence ID No.
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly
50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser
65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
80 85 90

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
95 100 105

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly
110 115 120

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly
125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala
140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala
155 160 165

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala
170 175 180

Leu Leu Ser Cys Val Thr Met Pro Val Ser Ala Val Glu Val Arg
185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn
200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly
215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu Arg Cys Trp Ile
230 235 240

Gln Val Thr Pro Asp Val Ala Val Lys His Arg Gly Ala Leu Thr
245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr
260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met
275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe
290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly
305 310 315

His Arg Met Ala Trp Asp Met Met Leu Asn Trp Ser Pro Thr Leu
320 325 330

Ala Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu
335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Ala Phe Gly Leu Gly
350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Ala Ile Leu
365 370 375

Leu Leu Val Ala Gly Val Asp Ala Ser Thr Tyr Ser Thr Gly Gln
380 385 390

Gln Ala Gly Arg Ala Ala Tyr Gly Ile Ser Ser Leu Phe Asn Thr
395 400 405

Gly Ala Lys Gln Asn Leu His Leu Ile Asn Thr Asn Gly Ser Trp
410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Glu Thr
425 430 435

Gly Phe Ile Ala Ser Leu Val Tyr Tyr Arg Arg Phe Asn Ser Ser
440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe
455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn
470 475 480

Asp Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro
485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys
500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly
515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu
530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr
545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro
560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys
575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys
590 595 600
Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr
605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile
620 625 630

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser
635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp
650 655 660

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu
665 670 675

Trp Ala Val Phe Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser
680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr
695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp
710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val
725 730 735

Cys Ala Cys Leu Trp Met Leu Asn Ile Leu Gly Gln Ala Glu Ala
740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser
755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp
770 775 780

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu
785 790 795

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln
800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala
815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys
830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val
845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val
860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His
875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu
890 895 900

Gly Pro Ala Tyr Leu Leu Arg Ala Ser Leu Leu Arg Ile Pro Tyr
905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys
920 925 930

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile
935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu
950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val
965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp
980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro
995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp
1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Asn Leu Leu Ala Pro Ile Thr Ala
1025 1030 1035

Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser
1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val
1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly
1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala
1085 1090 1095

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly
1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp
1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn
1130 1135 1140

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala
1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly
1160 1165 1170

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg
1175 1180 1185

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1190 1195 1200

Pro Val Glu Ser Leu Asp Ile Ala Thr Arg Thr Pro Ser Phe Ser
1205 1210 1215

Asp Asn Ser Ala Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly
1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
1235 1240 1245

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro






1250 1255 1260 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala
1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr
1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala
1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp
1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu
1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn
1355 1360 1365

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr
1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu
1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala
1400 1405 1410

Leu Arg Gly Thr Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu
1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val
1445 1450 1455

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu
1460 1465 1470
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Ser Gly Glu Gly Pro Ser Gly Met
1505 1510 1515

Phe Asp Ser Val Val Pro Cys Glu Cys Tyr Asp Ala Gly Ala Ala
1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala
1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asn Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro
1610 1615 1620

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val
1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1670 1675 1680

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg
1685 1690 1695

Val Val Val Thr Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp
1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly
1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu
1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Gly Met Gln Pro Ala Ile Gln
1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp
1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala
1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile
1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala
1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser
1835 1840 1845

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly
1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly
1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile
1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905

Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met
1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro
1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln
1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His
1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp
1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe
1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile
2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly
2015 2020 2025

Thr Gly Val Met Thr Thr Arg Tyr Pro Cys Gly Ala Asn Ile Ser
2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr
2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr
2060 2065 2070

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala
2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His
2090 2095 2100

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys
2105 2110 2115

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe
2135 2140 2145

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val
2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu
2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala
2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser
2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
2210 2215 2220

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu
2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val
2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp
2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys
2270 2275 2280

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro
2285 2290 2295

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr
2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Leu Gln Thr Pro Val Pro
2315 2320 2325

Pro Pro Arg Arg Arg Arg Ala Lys Ile Leu Thr Gln Asp Asp Val
2330 2335 2340

Glu Gly Ile Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu
2345 2350 2355

Gin Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly 

2360 2365 2370 

Gly Asp He Val Gin Gin Pro Ser Asp Glu Thr Ala Ala Ser Glu 

2375 2380 2385 

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 

2390 2395 2400 

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser 

2405 2410 2415 

Glu Gly Glu Cys Glu Val He Asp Ser Asp Ser Lys Ser Trp Ser 

2420 2425 2430 

Thr Val Ser Asp Gin Glu Asp Ser Val He Cys Cys Ser Het Ser 

2435 2440 2445 

Tyr Ser Trp Thr Gly Ala Leu He Thr Pro Cys Gly Pro Glu Glu 

2450 2455 2460 

Glu Lys Leu Pro He Asn Pro Leu Ser Asn Ser Leu Het Arg Phe 

2465 2470 2475 

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg 

2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Val Gin Val Leu Asp Ala His 

2495 2500 2505 

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val 

2510 2515 2520 

Gly Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp 
2555 2560 2565 

Glu Asn Leu Leu Glu Asp Gln Arg Thr Pro Ile Asp Thr Thr Ile 

2570 2575 2580 

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly 

2585 2590 2595 

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg 

2600 2605 2610 

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro 

2615 2620 2625 

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala 

2630 2635 2640 

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 

2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys 

2675 2680 2685 

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu 

2690 2695 2700 

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser 

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 

2720 2725 2730 

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys 

2735 2740 2745 

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp 

2750 2755 2760 

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu 

2765 2770 2775 
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 

2780 2785 2790 

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile 

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly 

2810 2815 2820 

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr 

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met 

2855 2860 2865 

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr 

2870 2875 2880 

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser 

2885 2890 2895 

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly 

2900 2905 2910 

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser 

2915 2920 2925 

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala 

2945 2950 2955 

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp 

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 

2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly 